Data Science with Jupyter, ebook

231 pages, English, Ebook, 2019


Step-by-step guide to practising data science techniques with Jupyter notebooks

Key features
Acquire Python skills to do independent data science projects
Learn the basics of linear algebra and statistical science the Python way
Understand how and when they're used in data science
Build predictive models, tune their parameters and analyze performance in a few steps
Cluster, transform, visualize, and extract insights from unlabelled datasets
Learn how to use matplotlib and seaborn for data visualization
Implement and save machine learning models for real-world business scenarios

Description
Modern businesses are awash with data, making data-driven decision-making tasks increasingly complex. As a result, relevant technical expertise and analytical skills are required to do such tasks. This book aims to equip you with just enough knowledge of Python, together with the skills to use powerful tools such as Jupyter Notebook, to succeed in the role of a data scientist.

The book starts with a brief introduction to the world of data science and the opportunities you may come across, along with an overview of the key topics covered in the book. You will learn how to set up an Anaconda installation, which comes with Jupyter and preinstalled Python packages. Before diving into several supervised, unsupervised and other machine learning techniques, you'll learn how to use the basic data structures, functions, libraries and packages required to import, clean, visualize and process data. Several machine learning techniques, such as regression, classification, clustering and time-series analysis, are explained with practical examples and by comparing the performance of various models.

By the end of the book, you will come across a few case studies to put your knowledge into practice and solve real-life business problems such as building a movie recommendation engine, classifying spam messages, predicting the ability of a borrower to repay a loan on time, and time series forecasting of housing prices. Remember to practise the additional examples provided in the code bundle of the book to master these techniques.

Who this book is for
The book is intended for anyone looking for a career in data science, all aspiring data scientists who want to learn the most powerful programming language in Machine Learning, and working professionals who want to switch their career to Data Science. While no prior knowledge of Data Science or related technologies is assumed, it will be helpful to have some programming experience.

Table of contents
1. Data Science Fundamentals
2. Installing Software and Setting up
3. Lists and Dictionaries
4. Function and Packages
5. NumPy Foundation
6. Pandas and Dataframe
7. Interacting with Databases
8. Thinking Statistically in Data Science
9. How to import data in Python?
10. Cleaning of imported data
11. Data Visualization
12. Data Pre-processing
13. Supervised Machine Learning
14. Unsupervised Machine Learning
15. Handling Time-Series Data
16. Time-Series Methods
17. Case Study - 1
18. Case Study - 2
19. Case Study - 3
20. Case Study - 4

About the author
Prateek is a data enthusiast who loves data-driven technologies. He has a total of 7 years of experience and currently works as a Data Scientist in an MNC. He has worked with finance and retail clients and has developed Machine Learning and Deep Learning solutions for their businesses. His keen areas of interest are natural language processing and computer vision. In his leisure time he writes posts about Data Science with Python on his blog.
Publication date: 20 September 2019
EAN13: 9789389423709
Language: English
File size: 1 MB

Data Science with Jupyter
Master Data Science skills with easy-to-follow Python examples
by Prateek Gupta
FIRST EDITION 2019
Copyright © BPB Publications, India
ISBN: 978-93-88511-377
Website: https://bpbonline.com/
 
All Rights Reserved. No part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher, with the exception of the program listings, which may be entered, stored and executed in a computer system but cannot be reproduced by means of publication.
LIMITS OF LIABILITY AND DISCLAIMER OF WARRANTY
The information contained in this book is true and correct to the best of the author’s and publisher’s knowledge. The author has made every effort to ensure the accuracy of this publication, but cannot be held responsible for any loss or damage arising from any information in this book.
All trademarks referred to in the book are acknowledged as properties of their respective owners.
Distributors:
BPB PUBLICATIONS 20, Ansari Road, Darya Ganj New Delhi-110002 Ph: 23254990/23254991
MICRO MEDIA Shop No. 5, Mahendra Chambers 150 DN Rd. Next to Capital Cinema, V.T. (C.S.T.) Station, MUMBAI-400 001 Ph: 22078296/22078297
DECCAN AGENCIES 4-3-329, Bank Street, Hyderabad-500195 Ph: 24756967/24756400
BPB BOOK CENTRE 376 Old Lajpat Rai Market, Delhi-110006 Ph: 23861747
 
 
Published by Manish Jain for BPB Publications, 20 Ansari Road, Darya Ganj, New Delhi-110002 and Printed by him at Repro India Ltd, Mumbai
About the Author
Prateek Gupta is a seasoned Data Science professional with 6+ years of experience in finding patterns and applying advanced statistical methods and algorithms to uncover hidden insights, maximize revenue and profitability, and ensure efficient operations management. He has worked with several multinational IT giants like HCL, Zensar and Sapient.
He is a self-starter and committed data enthusiast with expertise in the e-commerce domain. He has also helped clients like NTUC Singapore and Times Group India with his machine learning expertise in automatic product categorization, sentiment analysis, customer segmentation and recommendation engines. He is a staunch believer in the premise "Hard work triumphs talent when talent doesn’t work hard".
His keen areas of interest are cutting-edge research papers on machine learning and applications of natural language processing in various industry sectors. In his leisure time, he enjoys sharing knowledge through his blog and motivating young minds to enter the exciting world of Data Science.
His Blog: http://dsbyprateekg.blogspot.com/
His LinkedIn Profile: www.linkedin.com/in/prateek-gupta-64203354
Preface
Today, Data Science has become an indispensable part of every organization, and employers are willing to pay top dollar to hire skilled professionals. Driven by the rapidly changing needs of the industry, data continues to grow and evolve, thereby increasing the demand for data scientists. However, several questions continuously haunt every company: are there enough highly skilled individuals who can analyse the data, how much data will be available, where will it come from, and what advances in analysis techniques will yield greater insights? If you have picked up this book, you must have already come across these questions through talks or blogs from several experts and leaders in the industry.
To become an expert in any field, everyone must start learning from some point. This book is designed with that perspective in mind, to serve as your starting point in the field of data science. When I started my career in this field, I had little luck finding a compact guide which I could use to learn the concepts of data science, practise examples and revise them when faced with similar problems. I soon realized that Data Science is a very vast domain and that fitting all of its knowledge into one small book is nearly impossible. Therefore, I decided to distil my experience into this book, where you’ll gain the essential knowledge and skill set required to become a data scientist without wasting valuable time finding material scattered across the internet.
I planned the chapters of this book so that each one builds on the previous one. In the first chapter you will be familiarized with data and the modern data science skill set. The second chapter is all about setting up the tools of the trade, with the help of which you can practise the examples discussed in the book. In chapters three to six you will learn the Python data structures which you will use in your day-to-day data science work. The eighth chapter will teach you the statistical concepts most often used in data analysis. By the ninth chapter, you will be all set to start your journey of becoming a data scientist by learning how to read, load and understand different types of data in a Jupyter notebook for analysis. The tenth and eleventh chapters will guide you through different data cleaning and visualization techniques.
From the twelfth chapter onwards, you will combine the knowledge acquired in the previous chapters to do data pre-processing for real-world use cases. In chapters thirteen and fourteen you will learn about supervised and unsupervised machine learning problems and how to solve them. Chapters fifteen and sixteen cover time series data and teach you how to handle it. After covering the key concepts, I have included four different case studies where you will apply all the knowledge acquired and practise solving real-world problems.
This book is my humble effort to cover the fundamentals of Data Science using Python and to save readers’ time by focussing on practical examples rather than just theory. These practical examples include real-world datasets and real problems, which will give you the confidence to tackle similar or related data problems. I hope you will find this book valuable and that it will enable you to extend your data science knowledge as a practitioner quickly.
Acknowledgements
I would like to thank some of the brilliant knowledge-sharing minds, Jason Brownlee PhD, Hugo Bowne-Anderson and Filip Schouwenaars, from whom I have learnt and am still learning many concepts. I would also like to thank the open data science community Kaggle and the authors of various data science blogs on Medium for making data science and machine learning knowledge available to everyone.
I would also like to express my gratitude to the almighty God, my parents, my wife Pragya and my brother Anubhav for being extremely supportive throughout my life and the writing of this book.
Many thanks to the people at BPB Publications who made this book possible: Manish Jain, Nrip Jain and Varun Jain, and to the others who worked behind the scenes.
Finally, I would also like to thank Vinay Argekar, who served as the book’s acquisition editor, content reviewer and technical editor for improving the content day by day.
Downloading the code bundle and colored images:
Please follow the link to download the Code Bundle and the Colored Images of the book:
https://rebrand.ly/ab68d
Errata
We take immense pride in our work at BPB Publications and follow best practices to ensure the accuracy of our content and to provide an engaging reading experience to our subscribers. Our readers are our mirrors, and we use their inputs to reflect on and correct any human errors that occurred during the publishing process. To help us maintain quality and reach out to any readers who might be having difficulties due to unforeseen errors, please write to us at:
errata@bpbonline.com
Your support, suggestions and feedback are highly appreciated by the BPB Publications family.
Table of Contents
1. Data Science Fundamentals
What is Data?
What is Data Science?
What does a Data Scientist actually do?
Real world use cases of Data Science
Why Python for Data Science?
2. Installing Software and Setting up
System Requirements
Downloading the Anaconda
Installing the Anaconda in Windows
Installing the Anaconda in Linux
How to install a new Python library in Anaconda
Open your notebook- Jupyter
Know your notebook
3. Lists and Dictionaries
What is list?
How to create a list?
Different list Manipulation operations
Difference between lists and tuples
What is dictionary?
How to create a dictionary?
Some operations with dictionary
4. Function and packages
Help() function in Python
How to import a Python package?
How to create and call a function?
Passing parameter in a function
Default parameter in a function
How to use unknown parameters in a function?
Global and Local variable in a function
What is Lambda function?
Understanding main in Python
5. NumPy Foundation
Importing a NumPy package
Why NumPy array over List?
NumPy array Attributes
Creating NumPy arrays
Accessing element of a NumPy array
Slicing in NumPy array
Array Concatenation
6. Pandas and Dataframe
Importing Pandas
Pandas Data Structures
.loc[] and .iloc[]
Some Useful DataFrame Functions
Handling missing values in DataFrame
7. Interacting with Databases
What is SQLAlchemy?
Installing SQLAlchemy Package
How to use SQLAlchemy?
SQLAlchemy Engine Configuration
Creating A Table In Database
Inserting Data In Table
Update a record
How to join two tables
8. Thinking Statistically in Data Science
Statistics in Data Science
Types of Statistical data/variables?
Mean, Median and Mode
Basics of Probability
Statistical Distributions
Pearson Correlation Coefficient
Real World Example
Statistical Inference and Hypothesis Testing
9. How to import data in Python?
Importing txt data
Importing csv data
Importing Excel data
Importing JSON data
Importing pickled data
Importing a compressed data
10. Cleaning of imported data
Know your data
Analysing Missing Values
Droppi
