Tanujit Chakraborty's Blog
Course Name: Data Analytics (3-Credit Course) 2021
Participants: MBA (IB) 2020-22 Batch of Indian Institute of Foreign Trade (IIFT)
Faculty Name : Dr. Tanujit Chakraborty
Total Number of Students Registered : 191
Timeline : July 1, 2021 to September 30, 2021 | Total Teaching Hours : 30 Hours (20 Sessions)
Email: tanujitisi@gmail.com 
Course Feedback from Students : 4.3 out of 5.0
Faculty Feedback from Students: 4.27 out of 5.0 
Feedback for Data Analytics Course from Students.pdf
File Size: 281 kb
File Type: pdf
Download File

Course Introduction:

This course is designed to equip business students to extract implicit, previously unknown, and potentially useful knowledge from real-world data sets. It provides practical training that enables immediate and effective participation in data analytics projects. The course introduces Data Science as a way to address business challenges using business data. It also provides grounding in basic and advanced analytic methods (both Statistical and Machine Learning techniques) and an introduction to big data analytics technology and tools.

Course Objectives:

The participants will acquire the knowledge required for:

1. Extracting insights through data summarization, aggregation and visualization methods.
2. Pre-processing data for analytics and decision making using statistical methodologies.
3. Working through case studies and business-domain-specific applications of statistical and data analytics tools.
4. Developing models using statistical and machine learning techniques.
5. Generating actionable insights using unsupervised learning techniques.
6. Gaining hands-on experience with open-source software such as R and Python.
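
As a small illustration of the first objective (summarization and aggregation), the sketch below computes overall and per-group summary statistics using only the Python standard library. The sales records, the field names `region` and `sales`, and the helper names `summarize` and `aggregate_by` are all hypothetical and invented for this example; they are not taken from the course material.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical sales records, invented for illustration only.
records = [
    {"region": "North", "sales": 120.0},
    {"region": "South", "sales": 95.0},
    {"region": "North", "sales": 150.0},
    {"region": "South", "sales": 105.0},
    {"region": "East", "sales": 80.0},
]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


def aggregate_by(rows, key, value):
    """Group rows by `key` and summarize the `value` field within each group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {name: summarize(vals) for name, vals in groups.items()}


overall = summarize([r["sales"] for r in records])
by_region = aggregate_by(records, "region", "sales")

print("overall:", overall)
for region, stats in sorted(by_region.items()):
    print(region, stats)
```

In practice a data-frame library would usually handle this kind of grouping; the standard-library version is used here only to keep the sketch self-contained.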

Evaluation Components:

The evaluation components for the Data Analytics (DA) course are as follows:
1) Quiz - 20%; 2) Mid Term - 30%; 3) Class participation - 10%; 4) End Term Test - 40%.
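
The weights above combine into a final score as a weighted sum. The sketch below shows the arithmetic, assuming each component is marked on a 0-100 scale (the page does not state the marking scale). The function name `final_score` and the example marks are hypothetical.

```python
# Weights (in percent) taken from the evaluation scheme above.
WEIGHTS = {
    "quiz": 20,
    "mid_term": 30,
    "class_participation": 10,
    "end_term": 40,
}


def final_score(component_scores):
    """Combine component scores (assumed 0-100 each) into a weighted total."""
    missing = set(WEIGHTS) - set(component_scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    weighted_sum = sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)
    return weighted_sum / 100


# Hypothetical student marks, for illustration only.
example = {"quiz": 80, "mid_term": 70, "class_participation": 90, "end_term": 60}
total = final_score(example)
print(total)
```

For these made-up marks the total is 0.20 × 80 + 0.30 × 70 + 0.10 × 90 + 0.40 × 60 = 70.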

Textbooks:

Foundation books for this course are given below:

1. James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An Introduction to Statistical Learning (2nd ed.). Springer. (Read the free online copy here: https://www.statlearning.com/)
2. Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning (2nd ed.). Springer Series in Statistics. New York: Springer. (Read the free online copy here: https://web.stanford.edu/~hastie/ElemStatLearn/)
3. Burkov, A. (2019). The Hundred-Page Machine Learning Book. (Read the free online book here: http://themlbook.com/wiki/doku.php)

Other Reference Books:

The class notes are available below. References are given in the lecture notes. Students may also refer to:

1. Chambers, J. (2008). Software for Data Analysis: Programming with R. Springer Science & Business Media.
2. Provost, F., & Fawcett, T. (2013). Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking. O'Reilly.
3. Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and Practice. OTexts. (Read online: https://otexts.com/fpp3/)
4. Heumann, C., & Schomaker, M. (2016). Introduction to Statistics and Data Analysis. Springer.
5. Shmueli, G., Bruce, P. C., Yahav, I., Patel, N. R., & Lichtendahl Jr, K. C. (2017). Data Mining for Business Analytics: Concepts, Techniques, and Applications in R. John Wiley & Sons.

Interesting Data Science Papers For Reading (Surveys / Critical Analysis) :

I recommend that all participants read these (mostly non-mathematical) research papers alongside the course. Please click on a paper's name to view it:

1. Statistics - What are the most important statistical ideas of the past 50 years?
2. Data Science - 50 Years of Data Science
3. Time Series Forecasting - Statistical and Machine Learning forecasting methods: Concerns and ways forward
4. Machine Learning - How to avoid machine learning pitfalls: a guide for academic researchers
5. Deep Learning - Tabular Data: Deep Learning is Not All You Need

Resources for Data Science Interview Preparations :

1. Cheat Sheet on Statistics and Machine Learning
Also see : Interview Experiences ( Link )
Cheat Sheets on Statistics, DS and ML.pdf
File Size: 15053 kb
File Type: pdf
Download File

2. Data Science Interview Preparation Guide
Also see : Preparation Tips ( Link )
Data_Science_Interview_Guide.pdf
File Size: 7760 kb
File Type: pdf
Download File

3. Data Science Interview Questions and Answers
Also see : Topic-wise Question and Answers ( Link )
Data_Science_Interview_Questions.pdf
File Size: 1722 kb
File Type: pdf
Download File

4. Big Data Interview Questions and Answers
Also see : Reading Material on Deep Learning ( Link )
Big_Data_Engineering_Interviews.pdf
File Size: 1568 kb
File Type: pdf
Download File

5. Tutorial Paper on Time Series Forecasting 
Also see : Data and Code Link ( Link )
Forecasting_Tutorial_Paper.pdf
File Size: 1094 kb
File Type: pdf
Download File

Lecture Notes:

The class notes (slides prepared in both PPT and LaTeX) were used to teach the course ONLINE to IIFT MBA (IB) 2020-2022 students (3-Credit Course).
Session 1 :
Topic: Introduction to Data Analytics
DA Session 1.pdf
File Size: 13912 kb
File Type: pdf
Download File

Session 2 :
Topic: Descriptive Statistics & Probability Distributions
DA Session 2.pdf
File Size: 6291 kb
File Type: pdf
Download File

Session 3 :
Topic: Sampling Distributions and Hypothesis Testing
DA Session 3.pdf
File Size: 4641 kb
File Type: pdf
Download File

Session 4 :
Topic: Basics of R and RStudio
DA Session 4.pdf
File Size: 4593 kb
File Type: pdf
Download File

Session 5 :
Topic: Hands-On Programming with RStudio
DA Session 5 Data_Code.zip
File Size: 226 kb
File Type: zip
Download File

Session 6 :
Topic: Analysis of Variance (ANOVA)
DA Session 6.pdf
File Size: 4913 kb
File Type: pdf
Download File

Statistical_Tables.pdf
File Size: 10249 kb
File Type: pdf
Download File

Session 7 :
Topic: Correlation and Regression Analysis
DA Session 7.pdf
File Size: 5365 kb
File Type: pdf
Download File

Session 8 :
Topic: Statistical Modelling with RStudio
DA Session 8.pdf
File Size: 1142 kb
File Type: pdf
Download File

Session 9 :
Topic: Hands-On Statistical Modelling with RStudio
DA Session 9 Data_Code.zip
File Size: 224 kb
File Type: zip
Download File

Session 10 :
Topic: Logistic and Nonlinear Regression with RStudio
DA Session 10.pdf
File Size: 2930 kb
File Type: pdf
Download File

Session 11 :
Topic: Pattern Classification and Bayesian Classifier
DA Session 11.pdf
File Size: 4833 kb
File Type: pdf
Download File

Session 12 :
Topic: Similarity Measures and Sensitivity Analysis
DA Session 12.pdf
File Size: 5285 kb
File Type: pdf
Download File

Session 13 :
Topic: Unsupervised Learning - Clustering Techniques
DA Session 13.pdf
File Size: 4153 kb
File Type: pdf
Download File

Session 14 :
Topic: Supervised Learning - kNN and Decision Trees
DA Session 14.pdf
File Size: 1275 kb
File Type: pdf
Download File

Session 15 :
Topic: Artificial Neural Networks and Deep Learning
DA Session 15.pdf
File Size: 14606 kb
File Type: pdf
Download File

Session 16 :
Topic: Machine Learning and Deep Learning using Python
DA Session 16.pdf
File Size: 2760 kb
File Type: pdf
Download File

Session 17 :
Topic: Hands-On Programming ML and DL with Python
DA Session 17 Data_Code.zip
File Size: 510 kb
File Type: zip
Download File

Session 18 :
Topic: Time Series Forecasting
DA Session_18.pdf
File Size: 5007 kb
File Type: pdf
Download File

DA Session 18 Data_Code.zip
File Size: 3 kb
File Type: zip
Download File

Session 19 :
Topic: Ensemble Method : XGBoost Algorithm
DA Session 19.pdf
File Size: 1033 kb
File Type: pdf
Download File

DA Session 19 Data_Code.zip
File Size: 7 kb
File Type: zip
Download File

Session 20 :
Topic: Unsupervised Learning using R
DA Session 20.pdf
File Size: 1878 kb
File Type: pdf
Download File

DA Session 20 Data_Code.zip
File Size: 2 kb
File Type: zip
Download File

Contact Me

Dr. Tanujit Chakraborty 
Mathematics and Numerical Sciences Unit, Sorbonne Université. 
Researcher and Advisor at Sorbonne Center for Artificial Intelligence.  

Senior Fellow at Centre for Data Sciences, IIIT Bangalore, India.
Email: ctanujit@gmail.com

Join Me

Join me on Twitter
Connect with me on LinkedIn
Code Repository at GitHub
Publications at Google Scholar
Research Profile at ResearchGate

© 2013 onwards - All rights reserved to Tanujit Chakraborty.