Posts

Showing posts from February, 2020

366DaysofDataScience Catalogue

No | Day | Date | Topic | link_category | Link | lag
1 | 2 | 10/11/19 | some errors in SQL 2/6 | github | https://github.com/viswanathanc/SQL-for-Data-Science | 1
2 | 14 | 10/23/19 | Matplotlib – Visualization | kaggle | https://www.kaggle.com/viswanathanc/beginner-to-intermediate-matplotlib-visualizing | 12
3 | 22 | 10/31/19 | Role of EDA in Model Building | kaggle | https://www.kaggle.com/viswanathanc/role-of-eda-in-model-building | 19
4 | 27 | 11/05/19 | Beginner to intermediate matplotlib visualization | kaggle | https://www.kaggle.com/viswanathanc/beginner-to-intermediate-matplotlib-visualizing | 23
5 | 32 | 11/10/19 | bunch to dictionary | github | https://github.com/viswanathanc/basic_python | 27
6 | 33 | 11/11/19 | Stratified sampling | kaggle | https://www.kaggle.com/viswanathanc/stratifiedshufflesplit-working-with-less-data?scriptVersionId=23291002 | 27
7 | 33 | 11/11/19 | ...

Deep Learning (Goodfellow et al) - Chapter 2 review

     This is the first chapter of Part 1, the part that prepares you for deep learning, and it covers Linear Algebra. The authors advise you to skip the chapter if you are already familiar with the concepts. Do not expect to become Gilbert Strang after reading it, though! The chapter is concise: it presents only what is relevant to the rest of the book. The focus is on the mathematical concepts of Linear Algebra, and only at the end of the chapter do the authors touch on applications.

     Although Linear Algebra is widely used across science and engineering, computer scientists often have less experience with it. This is the motivation for the chapter. If you already have that experience, the authors point you to a reference (The Matrix Cookbook - Petersen and Pedersen, 2006) and wave you goodbye until the next chapter.

     Definitions ...

Some Terminologies - Module, Package, Framework, API,..

    This article lists some software development terminologies and their explanations, with a focus on Data Science.

Module:
     How can we reuse a function or a class that we have already written? Suppose we write functions for the Euclidean norm and the least square error while writing code for linear regression. We may need the same functions while writing code for K-nearest neighbors.
     To reuse the code, we can put such shareable functions in a single file. Let's save it as 'lin_alg.py' and import it while writing the code for linear regression and KNN.

     import lin_alg
     lin_alg.least_sq_er(y, y_pred)

Package:
     A module lets us reuse code, but how much should it contain? There may be 100-200 functions or more in use. We have two problems here. The first is the space required to store every function. If we want to just calculate the euclidean ...
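As a rough sketch of the module idea described above, 'lin_alg.py' might look like the following. The post does not show the file's contents, so the function bodies are assumptions: 'euclidean_norm' is a hypothetical name, and 'least_sq_er' is assumed here to return the sum of squared differences between the true and predicted values.

```python
# lin_alg.py - a hypothetical sketch of the shared module; the original post
# does not show its contents, so names and definitions here are assumptions.
import math


def euclidean_norm(vector):
    """Return the Euclidean (L2) norm of a sequence of numbers."""
    return math.sqrt(sum(x * x for x in vector))


def least_sq_er(y, y_pred):
    """Return the sum of squared differences between y and y_pred.

    Assumption: "least square error" means the sum of squared residuals.
    """
    if len(y) != len(y_pred):
        raise ValueError("y and y_pred must have the same length")
    return float(sum((a - b) ** 2 for a, b in zip(y, y_pred)))
```

With this file saved next to the linear regression and KNN scripts, each script can run 'import lin_alg' and call 'lin_alg.least_sq_er(y, y_pred)' without copying the function.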

CodeSignal - Almost Strictly Increasing Sequence

I came across an interesting question on CodeSignal.

"Given a sequence of integers as an array, determine whether it is possible to obtain a strictly increasing sequence by removing no more than one element from the array."

A sequence is strictly increasing if every element in the array is smaller than its successor.

To test whether a sequence is strictly increasing, we can check for each element whether it is greater than or equal to the next. In that case we can come to a conclusion that this sequence is not strictl...
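One possible way to solve the problem quoted above is a single pass that allows at most one violation. This is a minimal sketch and not necessarily the approach the post goes on to describe. The function name 'almost_increasing' is hypothetical. When a violation is found, the sketch checks whether removing either the current element or the previous one could repair the sequence.

```python
def almost_increasing(sequence):
    """Return True if removing at most one element can make the
    sequence strictly increasing. (Hypothetical helper name.)"""
    removed = False
    n = len(sequence)
    for i in range(1, n):
        if sequence[i] <= sequence[i - 1]:
            # A second violation cannot be repaired with a single removal.
            if removed:
                return False
            removed = True
            # Removing sequence[i - 1] works only if sequence[i] still
            # exceeds sequence[i - 2] (or there is no such element).
            drop_prev_fails = i >= 2 and sequence[i] <= sequence[i - 2]
            # Removing sequence[i] works only if sequence[i + 1] still
            # exceeds sequence[i - 1] (or there is no such element).
            drop_curr_fails = i + 1 < n and sequence[i + 1] <= sequence[i - 1]
            if drop_prev_fails and drop_curr_fails:
                return False
    return True
```

For example, 'almost_increasing([1, 3, 2])' is True because removing 3 leaves 1, 2, while 'almost_increasing([1, 3, 2, 1])' is False because no single removal is enough.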