πŸ€– Tech Stack

AI ON CRYPTOCURRENCY

Cryptocurrencies are a fascinating phenomenon in today's world. Digital coins have swiftly gained popularity and remain a highly profitable investment tool, capable of earning large returns on cryptocurrency exchanges or when held as long-term investments.

Many people are looking to apply Machine Learning, a branch of AI, in finance. AI's strong suit is pattern recognition: models can be trained to tell the difference between an apple and a pear, for example.

For example, we have developed an AI model that recognizes patterns in a coin's trend and forecasts demand for the coin. This demand forecasting helps determine whether a coin is in an inflationary or deflationary state.

OUR AI MODELS

We developed various models for different problem statements that help customers in the tokenomics and crypto world, using deep learning and NLP algorithms to build the appropriate model for each.

Our Score Card - Sentiment Analysis

Our Score Card - Fundamental Analysis

We scrape and extract data from different social media platforms and data sources such as Twitter, Facebook, Telegram, and Discord.

GIVING THE SCORE OR RANK TO THE PAGE (SOCIAL MEDIA ACTIVITY)

The collected data has various features or attributes. Using those attributes we derived a few additional attributes and assigned proper weightage to each of them, which helped in building the model appropriately.

We derived features such as total number of followers, total number of posts, total number of engagements, and engagement rate. Based on threshold values, the data is classified as Excellent, Good, Average, or Weak.
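
As a minimal sketch of this step (the column names and the threshold cut-offs below are illustrative assumptions, not our production settings), the feature derivation and labelling look roughly like this:

```python
# Minimal sketch of the scorecard feature derivation and threshold-based
# labelling. Threshold values and column names are illustrative assumptions.
import pandas as pd

def engagement_rate(engagements: int, followers: int) -> float:
    """Engagements per follower (0 if the page has no followers)."""
    return engagements / followers if followers else 0.0

def label_engagement_rate(rate: float) -> str:
    # Hypothetical cut-offs; real thresholds are tuned per platform.
    if rate >= 0.05:
        return "Excellent"
    if rate >= 0.02:
        return "Good"
    if rate >= 0.01:
        return "Average"
    return "Weak"

pages = pd.DataFrame({
    "followers":   [120_000, 3_500, 48_000],
    "posts":       [830, 210, 1_900],
    "engagements": [9_400, 40, 520],
})
pages["engagement_rate"] = [
    engagement_rate(e, f) for e, f in zip(pages["engagements"], pages["followers"])
]
pages["label"] = pages["engagement_rate"].map(label_engagement_rate)
print(pages)
```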

Along with this, we performed sentiment analysis on a few features such as tweets, which results in a sentiment label indicating whether the tweet is positive or negative.

Sentiment analysis is the application of artificial intelligence (AI) and natural language processing to examine people's feelings or thoughts on a posted tweet.

To derive the sentiment attribute used in the score, we used transformer models from Hugging Face to generate the sentiment and a confidence level for that sentiment. These transformer models are pre-trained models that have already been trained on huge datasets, and we made use of their parameters, functions, and attributes to achieve this sentiment analysis.
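
A minimal sketch of this step, assuming the Hugging Face transformers library and an off-the-shelf sentiment checkpoint (the specific model name below is an example, not necessarily the one used in production):

```python
# Sketch of generating a sentiment label and confidence score for each tweet
# with a pre-trained Hugging Face transformer. The checkpoint is an example;
# any fine-tuned sentiment model can be substituted.
from transformers import pipeline

sentiment = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

tweets = [
    "Huge partnership announced, this token is going to the moon!",
    "Dev team has gone silent again, starting to lose confidence.",
]

for tweet, result in zip(tweets, sentiment(tweets)):
    # Each result carries the predicted label and its confidence score.
    print(f"{result['label']:>8}  {result['score']:.3f}  {tweet}")
```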

This sentiment analysis involves several intermediate processing steps, such as stop-word removal, stemming, lemmatization, TF-IDF vectorization, intent detection, and bag-of-words representation.
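
A rough sketch of that classical preprocessing, assuming NLTK and scikit-learn; the tiny corpus is illustrative only:

```python
# Sketch of stop-word removal, lemmatization and TF-IDF vectorization.
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from sklearn.feature_extraction.text import TfidfVectorizer

nltk.download("stopwords", quiet=True)
nltk.download("wordnet", quiet=True)

lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words("english"))

def preprocess(text: str) -> str:
    # Lower-case, drop stop words and non-alphabetic tokens, lemmatize the rest.
    tokens = [lemmatizer.lemmatize(t) for t in text.lower().split()
              if t.isalpha() and t not in stop_words]
    return " ".join(tokens)

tweets = [
    "The devs are shipping great updates every week",
    "This token looks like another pump and dump scheme",
]
tfidf = TfidfVectorizer()
matrix = tfidf.fit_transform(preprocess(t) for t in tweets)
print(matrix.shape, tfidf.get_feature_names_out())
```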

Using all these features, we derived the scores and ranks with the right weightages.

BERT (Bidirectional Encoder Representations from Transformers): a transformer-based model that uses a bidirectional approach to create word embeddings and has achieved state-of-the-art results on many NLP tasks, such as sentiment analysis.

After the sentiment analysis, the positive labels, negative labels, and the existing features are combined, and a score is derived from metrics built on non-linear functions and assumption-based weightages.
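
Purely as an illustration of such an assumption-based, non-linear combination (the weights and the logistic form below are made-up examples, not the production formula):

```python
# Illustrative sketch of combining sentiment counts and page features into a
# single score. Weights and the logistic squashing are assumptions.
import math

def page_score(pos_tweets: int, neg_tweets: int, engagement_rate: float,
               followers: int) -> float:
    sentiment_ratio = pos_tweets / max(pos_tweets + neg_tweets, 1)
    # Log-scale followers so very large accounts do not dominate the score.
    reach = math.log10(max(followers, 1)) / 7.0
    raw = 0.5 * sentiment_ratio + 0.3 * min(engagement_rate / 0.05, 1.0) + 0.2 * reach
    # Logistic squashing keeps the final score in (0, 100).
    return 100 / (1 + math.exp(-6 * (raw - 0.5)))

print(round(page_score(pos_tweets=420, neg_tweets=80,
                       engagement_rate=0.03, followers=50_000), 1))
```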

SOCIAL MEDIA LEGITIMACY

The extracted data is unlabelled, so we used unsupervised learning to classify whether a page is Real or Fake, based on followers legitimacy and engagement legitimacy.

We rank a page using the proportion of real to fake followers (followers legitimacy) and the proportion of real to fake engagements (engagement legitimacy). These proportions are derived using AI algorithms such as k-means and k-nearest neighbours.

The k-means algorithm is an iterative algorithm that tries to partition the dataset into K pre-defined, distinct, non-overlapping subgroups (clusters) where each data point belongs to only one group. It tries to make the intra-cluster data points as similar as possible while also keeping the clusters as different (far apart) as possible. It assigns data points to a cluster such that the sum of the squared distance between the data points and the cluster's centroid (the arithmetic mean of all the data points that belong to that cluster) is at a minimum. The less variation we have within clusters, the more homogeneous (similar) the data points are within the same cluster.

We used k-means clustering and k-nearest neighbours to develop the model. Before applying the algorithm to the data, a few attributes have to be derived, such as the frequency of tweets per day, per week, or over another regular time period; similarly, the frequency of retweets and the reply count have to be calculated over the same intervals, e.g. number of retweets per week or number of replies per week.

After deriving all the attributes, the important features are selected and the algorithms are applied to the final dataset, which helps differentiate real and fake pages based on the patterns, as in the sketch below.
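
A minimal sketch of that clustering step with scikit-learn, assuming a toy feature matrix and k=2 (the real feature set and cluster interpretation are richer):

```python
# Sketch of clustering pages into "real-like" vs "fake-like" groups with
# k-means on derived activity features. Features and k=2 are assumptions.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Rows: pages. Columns: tweets/week, retweets/week, replies/week,
# followers-to-following ratio.
X = np.array([
    [35,  400, 120, 8.0],
    [2,     5,   1, 0.1],
    [28,  310,  95, 5.5],
    [1,     2,   0, 0.05],
])

X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X_scaled)

# Treat the cluster with the higher overall (standardized) activity as "real".
real_cluster = kmeans.cluster_centers_.sum(axis=1).argmax()
labels = ["Real" if c == real_cluster else "Fake" for c in kmeans.labels_]
print(labels)
```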

TOKENOMICS:

DEFLATIONARY COINS

Demand forecasting of cryptocurrency coins can be a challenging task, as their prices can be highly volatile and subject to many external factors.

We track the list of transactions taking place; if token burns are happening in the buy/sell transactions, or if the dead-wallet balance is increasing by any means that reduces the total supply of the tokens, the token is considered deflationary.

We use Web3-based events to track this deflationary behaviour.
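
One simple variant of this check, sketched with web3.py (v6+) using dead-wallet balance and total-supply snapshots rather than raw event logs; the RPC endpoint and token address are placeholders:

```python
# Minimal sketch of the deflationary check, assuming web3.py v6+.
# RPC_URL and TOKEN are placeholders, not real endpoints/contracts.
from web3 import Web3

RPC_URL = "https://example-rpc.invalid"                   # placeholder endpoint
TOKEN = "0x0000000000000000000000000000000000000000"      # placeholder token
DEAD = Web3.to_checksum_address("0x000000000000000000000000000000000000dead")

# Only the two read-only ERC-20 functions we need.
ERC20_ABI = [
    {"name": "balanceOf", "type": "function", "stateMutability": "view",
     "inputs": [{"name": "account", "type": "address"}],
     "outputs": [{"name": "", "type": "uint256"}]},
    {"name": "totalSupply", "type": "function", "stateMutability": "view",
     "inputs": [], "outputs": [{"name": "", "type": "uint256"}]},
]

w3 = Web3(Web3.HTTPProvider(RPC_URL))
token = w3.eth.contract(address=Web3.to_checksum_address(TOKEN), abi=ERC20_ABI)

def snapshot() -> tuple:
    """Current (dead-wallet balance, total supply) of the token."""
    return (token.functions.balanceOf(DEAD).call(),
            token.functions.totalSupply().call())

def looks_deflationary(previous: tuple, current: tuple) -> bool:
    # Burns show up as a growing dead-wallet balance or a shrinking supply.
    return current[0] > previous[0] or current[1] < previous[1]
```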

TOKEN UNLOCKS

Token unlocks refer to the amount of a token that is yet to be released or become available in circulation. There are several factors to consider when calculating token unlocks:

Locked Supply: The locked supply is the number of tokens that are not yet available for trading. This includes tokens that are still held by the project team, tokens that are subject to vesting schedules, or tokens that are locked in smart contracts.

We get the total supply and circulating supply values from the token contract using Web3, and the vesting data from the token's whitepaper to calculate the locked tokens. We then use a mathematical formula to calculate the percentage of the total supply that is yet to be unlocked.
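
A minimal sketch of that calculation, treating "locked" as the gap between total and circulating supply (the numbers are illustrative):

```python
# Sketch of the unlock calculation. The supply figures here are examples;
# in practice they come from the token contract and the whitepaper's vesting data.
def unlock_status(total_supply: float, circulating_supply: float) -> dict:
    locked = total_supply - circulating_supply
    return {
        "locked_tokens": locked,
        "locked_pct_of_total": 100.0 * locked / total_supply,
        "unlocked_pct_of_total": 100.0 * circulating_supply / total_supply,
    }

print(unlock_status(total_supply=1_000_000_000, circulating_supply=350_000_000))
```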

TOKEN STAKING RATIO

We can calculate the token staking ratio by dividing the total number of tokens staked by the total supply of tokens.

The formula is: staking ratio = (total number of tokens staked) / (total supply of tokens)

To get the total number of tokens staked, we need to get the token balance in the staking contract with Web3 (balance of the staking contract in terms of the token). To know the staking contract of the token, we need to parse the whitepaper. To get the total supply of tokens, we can call the token contract's totalSupply() method.

We then predict the rank based on the resulting value. If the stake ratio is less than 5% it is Weak, between 5-10% it is Average, between 10-20% it is Good, and above 20% it is Excellent. A higher staking ratio helps support the token price and keeps the crypto network healthy.
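
A small sketch of the ratio and its rank mapping (the staked amount is assumed to come from balanceOf on the staking contract and the supply from totalSupply(), as described above; the example numbers are made up):

```python
# Staking ratio and rank mapping, using the thresholds listed in this section.
def staking_ratio(tokens_staked: int, total_supply: int) -> float:
    return tokens_staked / total_supply

def staking_rank(ratio: float) -> str:
    if ratio < 0.05:
        return "Weak"
    if ratio < 0.10:
        return "Average"
    if ratio <= 0.20:
        return "Good"
    return "Excellent"

ratio = staking_ratio(tokens_staked=120_000_000, total_supply=1_000_000_000)
print(f"{ratio:.1%} -> {staking_rank(ratio)}")   # 12.0% -> Good
```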

MARKET ACTIVITY:

AVERAGE DAILY TRADING VOLUME

Average daily trading volume (ADTV) is the average number of shares traded within a day in a given stock.

To fetch the average daily trading volume of a cryptocurrency we use the CoinGecko API: first we get the cryptocurrency ID from the CoinGecko API, then use that ID to fetch the market data. This gives us the historical market data for the cryptocurrency, including the daily trading volume.

We then calculate the average daily trading volume by dividing the total trading volume over a given time period by the number of timestamps in that period.

If the value is less than $100k we classify it as Weak; between $100k and $500k it is classified as Average; between $500k and $1 million it is Good; and above $1 million it is Excellent.
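
A sketch of this flow against CoinGecko's public v3 API; the endpoint path and the total_volumes field reflect their documented market_chart response, and 'bitcoin'/30 days are just example inputs:

```python
# Sketch of fetching average daily trading volume from the CoinGecko v3 API
# and mapping it to a rank with the thresholds from this section.
import requests

API = "https://api.coingecko.com/api/v3"

def average_daily_volume(coin_id: str, days: int = 30) -> float:
    resp = requests.get(
        f"{API}/coins/{coin_id}/market_chart",
        params={"vs_currency": "usd", "days": days},
        timeout=30,
    )
    resp.raise_for_status()
    # "total_volumes" is a list of [timestamp_ms, volume_usd] pairs.
    volumes = [v for _, v in resp.json()["total_volumes"]]
    return sum(volumes) / len(volumes)

def volume_rank(adtv_usd: float) -> str:
    if adtv_usd < 100_000:
        return "Weak"
    if adtv_usd < 500_000:
        return "Average"
    if adtv_usd <= 1_000_000:
        return "Good"
    return "Excellent"

adtv = average_daily_volume("bitcoin")
print(f"${adtv:,.0f}/day -> {volume_rank(adtv)}")
```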

NUMBER OF HOLDERS INCREASING OR DECREASING

To get the number of token holders we use scraping. After fetching the current number of holders, we compare it with the holder count at a certain past timestamp.

If the current value is greater in this comparison, we rank it as 'Increasing'; otherwise we show 'Decreasing'.
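
A minimal sketch of that comparison; how the holder count is actually scraped is source-specific, so the fetch function below is a hypothetical stub:

```python
# Sketch of the holder-count trend check. fetch_holder_count() stands in for
# the scraper, which depends on the data source and is not shown here.
def fetch_holder_count(token_address: str) -> int:
    """Placeholder for the scraper returning the current holder count."""
    raise NotImplementedError

def holder_trend(current_holders: int, past_holders: int) -> str:
    """Compare the current holder count with a past snapshot."""
    return "Increasing" if current_holders > past_holders else "Decreasing"

print(holder_trend(current_holders=15_420, past_holders=14_980))  # Increasing
```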

CONTRACTS SECURITY:

PREDICTING WHETHER HONEYPOT/RUGPULL OR NOT

Honeypot - we use the standard API provider GoPlus Token Security API. Using their API we fetch the required parameters of the token to identify whether it is a honeypot or not.

Rugpull - we check the following parameters to state whether the token carries rug-pull risk or not:

  1. Whether the token contract is verified or not

  2. Whether it has a proxy or not

  3. Whether it is anti-whale or not

  4. Whether it is mintable or not

To check the status of these parameters, we use the GoPlus Security API, as in the sketch below.
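
A sketch of those checks against the GoPlus Token Security API; the endpoint path and field names follow their public v1 response format but should be treated as assumptions and verified against the GoPlus docs, and the chain id/token address are placeholders:

```python
# Sketch of the honeypot / rug-pull checks via the GoPlus Token Security API.
# Endpoint path and field names are assumptions based on GoPlus's public v1 API.
import requests

GOPLUS = "https://api.gopluslabs.io/api/v1/token_security/{chain_id}"

def token_security(chain_id: int, token_address: str) -> dict:
    resp = requests.get(
        GOPLUS.format(chain_id=chain_id),
        params={"contract_addresses": token_address},
        timeout=30,
    )
    resp.raise_for_status()
    # Results are keyed by the lower-cased contract address.
    return resp.json()["result"][token_address.lower()]

def rugpull_flags(info: dict) -> dict:
    # GoPlus returns "0"/"1" strings for most boolean flags.
    return {
        "honeypot":   info.get("is_honeypot") == "1",
        "verified":   info.get("is_open_source") == "1",
        "has_proxy":  info.get("is_proxy") == "1",
        "anti_whale": info.get("is_anti_whale") == "1",
        "mintable":   info.get("is_mintable") == "1",
    }
```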

LIQUIDITY

Using the standard API provider GoPlus Token Security API, we get all the pool addresses/pairs in which the token is pooled. After fetching all the pair addresses, we get each pool's liquidity by calling the reserves method of the pair contract using Web3, and then calculate the sum of the liquidity across all pools.

If the total liquidity is less than $50k it is classified as Weak; between $50k and $100k it is Average; between $100k and $250k it is Good; and above $250k it is Excellent.
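
A sketch of the summation with web3.py (v6+), assuming Uniswap-V2-style pair contracts exposing getReserves() and token0(); addresses and the RPC URL are placeholders, and converting reserves into USD is left out:

```python
# Sketch of summing pool liquidity across pairs, assuming web3.py v6+ and
# Uniswap-V2-style pair contracts. Pair addresses come from the GoPlus response.
from web3 import Web3

RPC_URL = "https://example-rpc.invalid"   # placeholder RPC endpoint
PAIR_ABI = [
    {"name": "getReserves", "type": "function", "stateMutability": "view",
     "inputs": [],
     "outputs": [{"name": "reserve0", "type": "uint112"},
                 {"name": "reserve1", "type": "uint112"},
                 {"name": "blockTimestampLast", "type": "uint32"}]},
    {"name": "token0", "type": "function", "stateMutability": "view",
     "inputs": [], "outputs": [{"name": "", "type": "address"}]},
]

w3 = Web3(Web3.HTTPProvider(RPC_URL))

def token_liquidity(pair_addresses: list, token_address: str) -> int:
    """Sum the token-side reserve over all pools the token is paired in."""
    token_address = Web3.to_checksum_address(token_address)
    total = 0
    for pair_addr in pair_addresses:
        pair = w3.eth.contract(
            address=Web3.to_checksum_address(pair_addr), abi=PAIR_ABI)
        reserve0, reserve1, _ = pair.functions.getReserves().call()
        # Pick the reserve belonging to our token, whichever side it sits on.
        total += reserve0 if pair.functions.token0().call() == token_address else reserve1
    return total
```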

DOXED TEAM

This predicts whether the team is a public (doxxed) team or an anonymous team. We developed a model using a Random Forest classifier. We analysed past data covering different aspects of a team, including their past performance, development progress, social media activity, and market sentiment, to make predictions.

We picked the important attributes or features using the Random Forest algorithm and normalized those attributes to keep all the features on the same scale. Once the model is built, we evaluate it using different classification metrics, and using a cost function we iterate on weight adjustments until the model converges.

We always need to build a generic model, one that classifies unseen data properly. Overly specific models lead to overfitting: they work fine on seen (training) data but fail to perform properly on unseen data.
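
A minimal sketch of such a classifier with scikit-learn's RandomForestClassifier; the feature columns and the tiny dataset are illustrative assumptions, not our training data:

```python
# Sketch of the public-vs-anonymous team classifier. Feature columns and data
# are illustrative; the real model is trained on historical project data.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Columns: past-performance score, commits/month, social posts/week, sentiment.
X = np.array([
    [0.80, 120, 14,  0.7], [0.20,  3, 1, -0.2], [0.90,  90, 10,  0.5],
    [0.10,   0,  0, -0.5], [0.70, 60, 8,  0.3], [0.30,   5,  2, -0.1],
    [0.85, 110, 12,  0.6], [0.15,  2, 1, -0.4],
])
y = np.array([1, 0, 1, 0, 1, 0, 1, 0])  # 1 = public (doxxed) team, 0 = anonymous

X_scaled = StandardScaler().fit_transform(X)  # keep all features on the same scale
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.25, random_state=42, stratify=y)

model = RandomForestClassifier(n_estimators=200, random_state=42)
model.fit(X_train, y_train)

print(classification_report(y_test, model.predict(X_test)))
# Feature importances guide which attributes to keep in the final model.
print(dict(zip(["performance", "commits", "posts", "sentiment"],
               model.feature_importances_.round(3))))
```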
