Sunday 5 July 2015

10 THINGS TO KEEP IN MIND FOR AN ANALYTICS PROJECT

  1. Before you start solving your analytics use case, ask yourself how significant the change would be for the business if you got the perfect answer to your question. If the change is not significant enough, don’t even bother to start solving it.
  2. The objective of your project should be a business problem or a strategic solution. If you see yourself solving a tactical or IT problem, remember that you are improving the means to an end, not the end itself.
  3. Decide what you want to do: assign a weight to a factor the customer already knew about, or give insight into a factor the customer didn’t know affects their business. For example, in analytics on student attendance, an insight of the former kind would be: whenever it rains, attendance drops by 27%. The school always knew that rain has an adverse effect on attendance, but it never knew the effect was 27%. (This can appear to contradict point 2, but in some cases it is specifically what the customer asks for. Whenever possible, try to avoid it.) An insight of the latter kind would be: whenever there is a bank holiday, attendance is negatively affected. This is something the school never knew about.
  4. Always ask your customer what percentage deviation from the actual business value can still be considered a successful prediction. However good your analytics may be, you will never be able to capture all the factors influencing their business numbers. For example, the customer can say, ‘If your predicted sales are within 5% on either side of my actual sales, I will consider the prediction correct.’
  5. While doing analytics for your customer, avoid giving your insight as an absolute value whenever possible. No matter how many factors you have included in your analysis, there will always be unknown ones that can make your prediction wrong. Instead, try to rank the factors affecting the customer’s business by their influence. The business can then plan better what to concentrate on as a priority.
  6. Ask the customer what the offset of an event is, meaning the lag between an event occurring and its results being reflected. For example, there may be an offset of 2 months between an ad campaign being launched and the period in which sales get a lift. This changes from product to product, depending on the factors and results. For some it may even be instantaneous.
  7. Try to understand from your customer what a significant change means to them. A value may be a significant change to one customer but not to another.
  8. Don’t try to pick the use case; pick a use case. When you are talking to the business, let the business choose the use case for you. Just provide them with the matrix below.
  9. Whichever use case you choose to implement will have multiple source systems of data. In most cases it’s not possible to include every source system in the analytics. To decide which source systems to spend your time on, use the graph below.
  10. Don’t try to answer all the business questions. Instead, try to give insights that enable the business to ask more questions. No one knows the business better than the business owner.
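To make points 4 and 6 a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names `within_tolerance` and `estimate_offset` are hypothetical, the 5% default comes from the example in point 4, and using correlation to guess the offset is one possible approach of my own choosing, not a prescribed method. The customer's own answer about the offset should always take precedence over an estimate like this.

```python
from statistics import mean


def within_tolerance(predicted, actual, tolerance_pct=5.0):
    """Return True when predicted is within +/- tolerance_pct of actual."""
    if actual == 0:
        # Percentage deviation is undefined for a zero actual value.
        return predicted == 0
    deviation_pct = abs(predicted - actual) / abs(actual) * 100.0
    return deviation_pct <= tolerance_pct


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def estimate_offset(events, results, max_lag):
    """Return the lag (in periods) at which events correlate most strongly with results.

    For a lag k, events[t] is paired with results[t + k].
    """
    best_lag, best_corr = 0, float("-inf")
    for lag in range(max_lag + 1):
        xs = events[: len(events) - lag]
        ys = results[lag:]
        if len(xs) < 2:
            break  # not enough overlapping points to correlate
        corr = pearson(xs, ys)
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag


# Made-up monthly data: sales respond to ad spend two months later.
ad_spend = [0, 10, 0, 0, 5, 0, 0, 8, 0, 0, 0, 0]
sales = [100, 100, 100, 130, 100, 100, 115, 100, 100, 124, 100, 100]

print(within_tolerance(103, 100))            # 3% off, inside the 5% band
print(within_tolerance(106, 100))            # 6% off, outside the 5% band
print(estimate_offset(ad_spend, sales, 4))   # lag with the strongest correlation
```

The data above is invented purely to exercise the functions. On real numbers the correlation peak is rarely this clean, so treat the estimated lag as a starting point for the conversation with the customer rather than as an answer.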

Wednesday 18 February 2015

HADOOP – HOT INTERVIEW QUESTIONS - What is the responsibility of name node in HDFS?

  1. The Name node is the master daemon that creates and maintains the metadata for blocks stored on data nodes.
  2. Every data node sends heartbeats and block reports to the Name node.
  3. If the Name node stops receiving heartbeats from a data node, it marks that data node as dead. The Name node itself is the single point of failure (in a cluster without Name node high availability). Without the Name node there is no metadata, and the Job Tracker can’t assign tasks to the Task Trackers.
  4. If the Name node goes down, the HDFS cluster is inaccessible. The client has no way to identify which data node has free space, because the data nodes themselves hold no metadata.
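
The heartbeat and block report bookkeeping described above can be sketched as a toy model in Python. This is not Hadoop code and does not use any Hadoop API: the class `ToyNameNode`, its method names, and the 30-second timeout are all hypothetical, chosen only to illustrate the idea that the Name node holds the block metadata and treats a silent data node as dead.

```python
class ToyNameNode:
    """Toy model of Name node bookkeeping (illustrative only, not HDFS code)."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds  # hypothetical timeout, not a Hadoop default
        self.last_heartbeat = {}  # data node id -> time of its last heartbeat
        self.block_map = {}       # block id -> set of data node ids holding it

    def receive_heartbeat(self, node_id, now):
        """Record that a data node was alive at time `now`."""
        self.last_heartbeat[node_id] = now

    def receive_block_report(self, node_id, block_ids):
        """Replace what we know about the blocks stored on one data node."""
        for holders in self.block_map.values():
            holders.discard(node_id)
        for block_id in block_ids:
            self.block_map.setdefault(block_id, set()).add(node_id)

    def dead_nodes(self, now):
        """Data nodes whose last heartbeat is older than the timeout."""
        return sorted(
            node_id
            for node_id, seen in self.last_heartbeat.items()
            if now - seen > self.timeout_seconds
        )

    def locate_block(self, block_id, now):
        """Live data nodes that hold the given block."""
        dead = set(self.dead_nodes(now))
        return sorted(self.block_map.get(block_id, set()) - dead)


name_node = ToyNameNode(timeout_seconds=30)
name_node.receive_heartbeat("dn1", now=0)
name_node.receive_heartbeat("dn2", now=0)
name_node.receive_block_report("dn1", ["b1"])
name_node.receive_block_report("dn2", ["b1", "b2"])
name_node.receive_heartbeat("dn1", now=40)  # dn2 stays silent

print(name_node.dead_nodes(now=50))          # dn2 has been silent for 50 seconds
print(name_node.locate_block("b1", now=50))  # only the live holder remains
print(name_node.locate_block("b2", now=50))  # no live data node holds b2
```

Notice that all the lookups go through the `ToyNameNode` object. If it disappears, nothing else in the model knows which node holds which block, which is the single point of failure the answer describes.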