
Monday 20 November 2023

Evaluating the success of an AI&ML use case

The data science team has finished development of the current version of the ML model and has reported an accuracy or error metric. But you are not sure how to put that number in context: is it good, or not good enough?

In one of my previous blogs, I addressed the issue of AI investment and how long it takes before the business can know whether the engagement has potential or is going nowhere. This blog can be considered an extension of that one. If you haven’t checked it out already, please visit: https://anirbandutta-ideasforgood.blogspot.com/2023/07/investment-on-developing-ai-models.html

In that blog, I spoke about KPIs like accuracy and error as thumb rules to quickly assess the potential success of a use case. In this blog, I will try to add more specificity and context to them.

Fundamentally, there are three ways to evaluate the latest performance KPI of your AI & ML model, used independently or in combination.

Human-level performance metric

For AI use cases whose primary objective is replacing human effort, this can be considered the primary success metric. For example, if the current human error rate for a particular process stands at 5%, and the AI achieves an error rate of 5% or less, the model can be judged valuable. At the same error rate, the AI additionally brings smart automation, a faster process, negligible down-time, etc.

Example: Data-entry tasks can easily be replicated by AI. But the success criterion for adoption does not need to be 100% accuracy; the AI merely has to match the accuracy the human counterpart was delivering to be adopted for real-world deployment.
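The adoption rule above can be sketched as a one-line check (the function name and the numbers are illustrative, not taken from any real deployment):

```python
def beats_human_benchmark(model_error: float, human_error: float) -> bool:
    """Deploy when the model's error rate is at or below the human error rate."""
    return model_error <= human_error

# A model with 4.8% error clears a 5% human benchmark; 6.2% does not.
print(beats_human_benchmark(0.048, 0.05))  # True
print(beats_human_benchmark(0.062, 0.05))  # False
```

The point is that the threshold comes from the measured human baseline, not from an arbitrary target like 100%.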

Base Model metric

For use cases where the problem being addressed is more theoretical in nature, or where discovery of the business problem is still in progress, it is best to create a quick, simple base model and then try to improve on it with each iteration.

For example: I am currently working on a system to determine whether a piece of content was created by AI. Lacking any past reference against which accuracy can be compared, I have taken this approach to measure progress.
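One quick way to get such a base model is a majority-class predictor; every later iteration only has to beat its score. A minimal sketch, with made-up labels:

```python
from collections import Counter

def majority_baseline(train_labels):
    """Return a predictor that always outputs the most frequent training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda _x: most_common

def accuracy(predict, xs, ys):
    """Fraction of examples the predictor gets right."""
    return sum(predict(x) == y for x, y in zip(xs, ys)) / len(ys)

train = ["human", "human", "ai", "human"]
baseline = majority_baseline(train)  # always predicts "human"
# Any real detector iteration must beat this floor: 2 of 3 correct here.
print(accuracy(baseline, ["doc1", "doc2", "doc3"], ["human", "ai", "human"]))
```

The baseline costs almost nothing to build, but it turns "is 78% accuracy good?" into the answerable "is 78% better than the baseline's score?".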

Satisfying & optimizing metric

We outline both a metric that we want the model to do as well as possible on (the optimizing metric) and a minimum standard that makes the model functional and valuable in real-life scenarios (the satisfying metric).

Example: For a home voice assistant, the optimizing metric would be the accuracy with which the model hears exactly what someone said. The satisfying metric would be that the model takes no more than 100 ms to process what was said.
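The combined rule can be sketched as a small selection function (the model names, numbers, and the 100 ms cutoff are illustrative):

```python
def pick_model(candidates, max_latency_ms=100):
    """Among models meeting the satisfying metric (latency <= max_latency_ms),
    pick the one with the best optimizing metric (accuracy)."""
    feasible = [m for m in candidates if m["latency_ms"] <= max_latency_ms]
    if not feasible:
        return None  # no model meets the minimum standard
    return max(feasible, key=lambda m: m["accuracy"])

models = [
    {"name": "A", "accuracy": 0.95, "latency_ms": 180},  # most accurate, but too slow
    {"name": "B", "accuracy": 0.92, "latency_ms": 90},
    {"name": "C", "accuracy": 0.89, "latency_ms": 40},
]
print(pick_model(models)["name"])  # B: best accuracy among models under 100 ms
```

Model A would win on accuracy alone, but it fails the satisfying metric, so it is never a candidate.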

Monday 19 June 2023

AI, ML, Data Science - frequent consultation advice for leadership

In this post I have tried to compile the questions, discussions and queries I most often come across while consulting on data science road maps with leaders and managers. I hope this compilation will also add value to other leaders and managers who may have wondered about these topics at some point but didn’t get the opportunity to have those discussions. Many of you may have a better response or more exposure to some of the questions; I have compiled these based on the best knowledge I have and how I go about explaining them.

Please reach out if you have a question or point that comes up a lot during consultation and is worth discussing.

1. Should we build this product or capability in-house or get it from a vendor?
A vendor product will always be a generalized one, built to cut across as many businesses as possible, because vendors thrive on repeatability. When you build in-house, you can customize for a smaller set of use cases or scenarios and perhaps create better differentiation.
So please ask yourself –
· When partnering with a vendor, what role do I play? What stops the vendor from partnering with the business directly in future? What is my value addition? Is there a risk I become insignificant?
· What kind of team do I have? If you have a great engineering team, maybe you want to do more in-house and keep a bigger piece of the pie for yourself.
· What is my core capability? Is the capability needed in line with our core skill, or is it something we want to learn? Then maybe we should do it in-house. Or is it something we just want to get done? Then maybe the best way is to get a vendor involved.

2. We have created certain analytics use cases, but we see several other teams also creating similar use cases.
Differentiation of analytics products or use cases is driven by each of, or a combination of, the below:
a) Deep domain knowledge
b) Combination of data from different systems brought together on a dynamic big data system
c) Deep or mature algorithms applied
If your use cases are easy to replicate, they are most probably built on shallow data, with very general domain knowledge, applied with basic data science techniques.

3. Are we using the kind of AI that is used in technologies like self-driving cars?
Yes and no. Internally, all these technologies use combinations of neural networks and reinforcement learning, and for different use cases we have used variations of the same and similar technologies. But technologies like self-driving cars work on image or vision data, which we generally don’t handle. Our use cases are mostly based on numerical, text and language data.

4. Vendor says their product is being used by Amazon. So should we go ahead and buy it?
Maybe it is being used by Amazon or a similarly big company, but ask the vendor whether their product is used in a mission-critical process, in a PoC, or to store and process something like click-stream data that is not business critical. It makes all the difference whether the logos a vendor shows you use the technology for business-critical projects or for non-critical processes.

5. We are showing the use case to the business, but it’s not making much of an impact.
The storytelling needs to be improved. Every analytics use case must be associated with a story that ends with the business making or saving money. If the story cannot relate how the engineering will improve the customer’s financial bottom line, the customer does not care about it, irrespective of how good the engineering is.

6. Now we have a data scientist on our team. Can we now expect more insights from our data?
Data scientists alone cannot ensure project success. Data engineers and big data and cloud infrastructure engineers are equally important parts of the technical team. Without the infrastructure in place and the data stored in it in the proper format, a data scientist cannot work his or her magic.

7. We are finding it very difficult to hire data scientists and big data developers.
Though there is no dearth of CVs, finding genuinely talented people with real production-implementation experience is difficult. And among the few, most are already well paid and on good projects. So whenever a decision is taken to hire senior data science talent, keep a six-month time frame in hand.

8. What is the difference between ML and AI?
Though you will find several answers to this on the internet, one good way I have found to explain it to a business person, without the jargon, is as follows. By definition, ML comes within the broader scope of AI. But to understand and remember it better: AI is something built to replicate human behavior. A program is called a successful AI when it can pass a Turing test, and a system passes a Turing test when we cannot differentiate the intelligence of a machine from that of a human. ML is a system you create to find patterns in a data set too big for a human brain to comprehend. On a lighter note: if you ask a machine 1231*1156 and it answers in a fraction of a second, it is ML; if it pauses, makes some comment, and answers after 5 minutes, like a human, it is AI.

9. Why aren’t we using a big data Hadoop architecture but using RDBMS like MSSQL. Oracle.
RDBMS products like MSSQL, Oracle are still viable analytics products and are not replaceable by Big Data tools for many scenarios. Deciding on a data store or a data processing engine involves a lot of factors like ACID-BASE properties, type and size of data, current implementation, skill set of the technical team etc. So doing an analytics project does not make Hadoop or NoSQL product default.

10. Here is some data, give me some insight.
This is the first line of many failed initiatives. A project that is not clear about the business problem it wants to solve is sure to fail. Starting an analytics project without a clear goal, just for the sake of adding a data science project to the portfolio, with no road map for how it will eventually contribute to company goals, is a waste of resources and will only end in failure.

Saturday 29 April 2023

Guru Mantras – What works for a successful AI, ML & Data Science implementation

  • Value of the problem - Before you start solving your analytics use case, ask yourself: how significant will the change be for the business if you get the perfect answer to your question? If the change is not significant enough, don’t even bother to start solving it. With enough data most questions can be answered, but is it worth the effort?
  • You get paid for the business solution, not the technical engineering - The objective of your project should be a business problem or a strategic solution. If you find yourself solving a tactical or IT problem, remember you are impacting the means to an end, not the end itself.
  • We ourselves are the most advanced intelligence - Whenever you are thinking about a problem, think how our brain would solve it. Although our solutions are not as sophisticated as the brain, they were all inspired by it. When you shoot a basketball, think how the brain weighs different things (our height, distance from the basket, strength of the wind, our angle to the basket, etc.), determines the strength of the throw, and gets better with time. When you build the machine, think how it should process the same inputs and improve with time.
  • What is success? Understand from your customer what success means to them, and define success as part of the project scope. Try not to promise a particular number, such as 95% accuracy, as part of the scope. Explain the algorithm's outcomes, and try to explain what the algorithm does in a simple manner. The business will be much more open to including algorithmic outcomes in their decision-making processes if they have some intuition of what the algorithm does.
  • Some ML models are black boxes. Understand that some ML models, given enough data to train on, simply work. We tell them how to ingest and process the data, but we don’t actually know how they internally differentiate. For example, we are currently running a project to differentiate between a clean and a messy room. We have had tremendous success, but we really don’t know how the model internally distinguishes between the two.

  • Remember the AI winter. An AI winter is a period of reduced funding and interest in artificial intelligence research. It was the result of hype: over-inflated promises by developers, unnaturally high expectations from end users, and extensive promotion in the media. The term was coined by analogy to the idea of a nuclear winter, and it happened in the 70s and 80s. So do not be pressured into saying yes to something just because the business has read about it or seen it somewhere else. Understand that there is a lot of false hype around, and be sure that what you are promising can be built in a feasible manner.
  • You cannot build a castle on air. Algorithms are only about as good as the data, in terms of quality and size, and the infrastructure they run on. So before you engage that data scientist, ask yourself: do I have the required data, in both size and format, at consistent quality, available in my repository to run the algorithm on? Do I have the processing power and big data infrastructure set up to handle that processing? If not, building the algorithm should not be your first priority. Remember, many of the concepts we use now were always there; it is our processing power that has brought them back into the spotlight.

    • Don’t fight the big techs Use domain knowledge that you have gathered over years of being in business i.e. tricks of trade or domain business knowledge. It can be years of building cars, manufacturing things or making software products.
    1. · Avoid direct competition with the tech giants. If the product is too generic in nature they will build it faster and better with their deep resources.
    2. · Integrate data science closely with business products. Analytics should be out of the box and intuitive. If required collaborate with the tech giants’ offerings but never give away domain expertise.
    3. · Enable data science teams with domain knowledge. Celebrate people who are domain experts and make them part of the data science team.
    And we know big tech 4 – (Facebook, Amazon, Microsoft, Google) strengths.
    1. Deep funds available to do experimental innovation. Significantly less pressure to go profitable.
    2. Army of engineers and scientists.
    3. Vast computer infra. So vast that a company can rent their unused infra and it can become one of world’s biggest business (read AWS)
  • Systems are difficult to bring together. I often hear people talk about bringing systems together as if it were a walk in the park. It is more like swimming across the Arctic Ocean. Think about bringing Twitter and Facebook data together and identifying that two accounts belong to the same person. In most cases the PII data cannot be used due to data security, and matching records at the data level is a great challenge.
  • A few things are difficult to predict and act on. The stock market is one such example.
    1. It depends heavily on the principles of game theory: you are very much dependent on what others are doing. In a way you are trying to predict others’ behavior rather than the market.
    2. There is a lot of fake and misleading news on the net.
    3. Influencing factors keep changing. We never knew Trump’s tweets could change the course of the market. But you can appreciate the trend in the long run.


    Saturday 18 February 2023

    Demystifying BERT - Things you should know before trying out BERT

     Introducing BERT


    What is BERT?
    BERT is a method of pre-training language representations, meaning that we train a general-purpose "language understanding" model on a large text corpus (like Wikipedia), and then use that model for downstream NLP tasks that we care about (like question answering). BERT outperforms previous methods because it is the first unsupervised, deeply bidirectional system for pre-training NLP.

    What makes BERT different?
    BERT builds upon recent work in pre-training contextual representations — including Semi-supervised Sequence Learning, Generative Pre-Training, ELMo, and ULMFit. However, unlike these previous models, BERT is the first deeply bidirectional, unsupervised language representation, pre-trained using only a plain text corpus (in this case, Wikipedia).

    Why does this matter? Pre-trained representations can either be context-free or contextual, and contextual representations can further be unidirectional or bidirectional. Context-free models such as word2vec or GloVe generate a single word embedding representation for each word in the vocabulary. For example, the word “bank” would have the same context-free representation in “bank account” and “bank of the river.” Contextual models instead generate a representation of each word that is based on the other words in the sentence. For example, in the sentence “I accessed the bank account,” a unidirectional contextual model would represent “bank” based on “I accessed the” but not “account.” However, BERT represents “bank” using both its previous and next context — “I accessed the ... account” — starting from the very bottom of a deep neural network, making it deeply bidirectional.

    The Strength of Bidirectionality If bidirectionality is so powerful, why hasn’t it been done before? To understand why, consider that unidirectional models are efficiently trained by predicting each word conditioned on the previous words in the sentence. However, it is not possible to train bidirectional models by simply conditioning each word on its previous and next words, since this would allow the word that’s being predicted to indirectly “see itself” in a multi-layer model.

To solve this problem, we use the straightforward technique of masking out some of the words in the input and then conditioning each word bidirectionally to predict the masked words. For example:

Input: the man went to the [MASK1] . he bought a [MASK2] of milk.
Labels: [MASK1] = store; [MASK2] = gallon
    While this idea has been around for a very long time, BERT is the first time it was successfully used to pre-train a deep neural network.
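As a rough sketch of the masking idea (tokenization is simplified and the helper names are mine, not BERT's actual implementation; the 15% rate follows the BERT paper):

```python
import random

MASK_RATE = 0.15  # fraction of tokens hidden during pre-training

def mask_tokens(tokens, rng):
    """Replace ~15% of tokens with [MASK]; return the masked input
    plus a map from masked positions to the original tokens to predict."""
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < MASK_RATE:
            masked.append("[MASK]")
            targets[i] = tok  # the model must recover this token
        else:
            masked.append(tok)
    return masked, targets

rng = random.Random(0)
tokens = "the man went to the store and bought a gallon of milk".split()
masked, targets = mask_tokens(tokens, rng)
print(masked)
```

The real implementation is richer (it sometimes keeps the original token or substitutes a random one instead of [MASK]), but the core trick is the same: the loss is computed only at the masked positions, so the model can safely attend to both left and right context.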

BERT also learns to model relationships between sentences by pre-training on a very simple task that can be generated from any text corpus: given two sentences A and B, is B the actual sentence that comes after A in the corpus, or just a random sentence? For example:

Sentence A: the man went to the store.
Sentence B: he bought a gallon of milk.
Label: IsNextSentence

Sentence A: the man went to the store.
Sentence B: penguins are flightless.
Label: NotNextSentence
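Generating such sentence pairs from a corpus can be sketched as follows (the helper name and the toy corpus are illustrative):

```python
import random

def make_nsp_pairs(sentences, rng):
    """Build (A, B, is_next) training pairs from an ordered list of sentences:
    half the time B is the real next sentence, half the time a random one."""
    pairs = []
    for i in range(len(sentences) - 1):
        if rng.random() < 0.5:
            pairs.append((sentences[i], sentences[i + 1], True))
        else:
            # pick a distractor that is not the true next sentence
            distractor = rng.choice([s for s in sentences if s != sentences[i + 1]])
            pairs.append((sentences[i], distractor, False))
    return pairs

corpus = [
    "the man went to the store.",
    "he bought a gallon of milk.",
    "penguins are flightless.",
]
for a, b, is_next in make_nsp_pairs(corpus, random.Random(0)):
    print(a, "|", b, "|", is_next)
```

Because the labels come for free from sentence order, this task needs no human annotation, which is what makes it usable on any text corpus.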
How I extended BERT for a chatbot

The already pre-trained BERT was fine-tuned on the SQuAD dataset.
The model is pre-trained for 40 epochs over a 3.3-billion-word corpus, including BooksCorpus (800 million words) and English Wikipedia (2.5 billion words).

The model was fine-tuned on the Stanford Question Answering Dataset (SQuAD), a reading-comprehension dataset consisting of questions posed by crowdworkers on a set of Wikipedia articles.


    Deployment

I used Flask, a web services framework in Python, to wrap the machine learning Python code into an API.
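A minimal sketch of such a Flask wrapper (the endpoint path and the `answer` stub are illustrative; the real fine-tuned model call would replace the stub):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def answer(question: str, context: str) -> str:
    """Stub standing in for the fine-tuned BERT question-answering model."""
    return "stub answer for: " + question

@app.route("/qa", methods=["POST"])
def qa():
    # Expect a JSON body like {"question": "...", "context": "..."}
    payload = request.get_json(force=True)
    return jsonify({"answer": answer(payload["question"], payload["context"])})

# To serve: app.run(host="0.0.0.0", port=5000)
```

Clients then POST a question and context as JSON and get the model's answer back as JSON, keeping the ML code fully decoupled from callers.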
    Training and Maintenance

• Google BERT is a pre-trained model; no training from scratch is involved.
• You can fine-tune it though, as I did on the SQuAD dataset.
• If you can spend some time understanding the underlying code, you can customize it to better suit your domain and requirements, as we did.
• Once the code is deployed, it needs to be constantly monitored and evaluated to identify scope for improvement.
• No day-to-day training is required.
    Infra spec

Though the BERT pre-trained model should run on any infra spec generally advised for analytics use cases, the infra Google advises for fine-tuning is on the higher end by non-Google standards.
(Though BERT without fine-tuning is also effective, fine-tuning results in substantial accuracy improvements.)

    As per Google –

    • Fine-tuning is inexpensive. All of the results in the paper can be replicated in at most 1 hour on a single Cloud TPU, or a few hours on a GPU, starting from the exact same pre-trained model. SQuAD, for example, can be trained in around 30 minutes on a single Cloud TPU to achieve a Dev F1 score of 91.0%, which is the single system state-of-the-art.
• All results in the paper were fine-tuned on a single Cloud TPU, which has 64GB of RAM. It is currently not possible to reproduce most of the BERT-Large results from the paper using a GPU with 12GB - 16GB of RAM, because the maximum batch size that can fit in memory is too small.
    • The fine-tuning examples which use BERT-Base should be able to run on a GPU that has at least 12GB of RAM using the hyperparameters given.
• Most of the examples assume that you will be running training/evaluation on your local machine, using a GPU like a Titan X or GTX 1080.

    References I used for my learning and some content