The Essentials of Data Science: Knowledge Discovery Using R (Chapman & Hall/CRC The R Series) 1st Edition
The Essentials of Data Science: Knowledge Discovery Using R presents the concepts of data science through a hands-on approach using free and open source software. It systematically drives an accessible journey through data analysis and machine learning to discover and share knowledge from data.
Building on over thirty years’ experience in teaching and practising data science, the author encourages a programming-by-example approach to ensure students and practitioners attune to the practice of data science while building their data skills. Proven frameworks are provided as reusable templates. Real-world case studies then provide insight for the data scientist to swiftly adapt the templates to new tasks and datasets.
The book begins by introducing data science. It then reviews R’s capabilities for analysing data by writing computer programs. These programs are developed and explained step by step. From analysing and visualising data, the framework moves on to tried and tested machine learning techniques for predictive modelling and knowledge discovery. Literate programming and a consistent style are a focus throughout the book.
- ISBN-10: 1138088633
- ISBN-13: 978-1138088634
- Edition: 1st
- Publisher: Chapman and Hall/CRC
- Publication date: July 17, 2017
- Language: English
- Dimensions: 6.1 x 0.7 x 9.1 inches
- Print length: 344 pages
Editorial Reviews
Review
"I have several books on data science and R, as well as other similar subjects and programming languages, in my personal library. However, this book is a great blend of important data science topics and R programming that will make it a great reference for anyone working in this important and immensely popular area. I highly recommend this book for college students learning what it takes to start their career in data science or even current professionals wanting to make a career change or who just want to know more about the subject (and do some R programming as well)."
~Dean V. Neubauer, Technometrics
"Due to the self-contained introduction to many of the features of R and RStudio, Graham J. Williams's The Essentials of Data Science: Knowledge Discovery Using R would make an excellent recommended or supplementary text for a course that plans to use the rattle package. This book would also serve as a great resource for those with an interest in data science who would like a hands-on approach to learning R and getting a flavor for a handful of topics within data science."
~Katherine M. Kinnaird, Brown University
About the Author
Graham J. Williams is Director of Data Science with Microsoft and Honorary Associate Professor with the Australian National University. He is also Adjunct Professor with the University of Canberra. He was previously Senior Director of Analytics with the Australian Taxation Office, Lead Data Scientist with the Australian Government's Centre of Excellence in Data Analytics, and International Visiting Professor of the Chinese Academy of Sciences.
Over three decades, Graham has been an active machine learning researcher and author of many publications and software including Rattle. As a practitioner of data science he has deployed solutions in areas including finance, banking, insurance, health, education and government. He is also chair and steering committee member of international conferences in knowledge discovery, artificial intelligence, machine learning, and data mining.
Product details
- Item Weight : 1.15 pounds
- Part of series : Chapman & Hall/CRC The R Series
- Best Sellers Rank: #13,305,029 in Books
- #1,678 in Business Statistics
- #2,216 in Data Mining (Books)
- #4,008 in Statistics (Books)
Customer reviews
- 5 star: 91%
- 4 star: 0%
- 3 star: 0%
- 2 star: 9%
- 1 star: 0%
Top review from the United States
- Reviewed in the United States on October 22, 2017
Format: Hardcover
I apologize in advance for my mediocre English, but it will not affect the meaning of my comment (of course) ;)
So:
Very disappointed.
The only thing worthy of note is the template models, which are usable for an "ordered workflow", although they could still be optimized.
1) No reference to the statistical notions behind the visual analysis of variables (for example, whether a possible interaction or relation between two or more variables is statistically significant; no parametric or non-parametric tests are presented).
2) The visualization techniques are very standard. More representative approaches such as mosaic displays, contingency tables, stratified analysis, sieve diagrams, association plots, and many others, which are the basis of categorical data analysis, are NOT mentioned.
3) No reference to how to carry out a significant analysis of categorical variables.
4) No reference to handling missing values correctly (only a hint at randomForest's "na.roughfix" method, which is a very elementary approach, if not totally inadvisable).
5) No reference to regression, time series, or the other topics listed in the "Aim and Scope of the book", nor to their assumptions.
6) A totally elementary presentation of classification strategies and modelling.
7) Excessive exposition of performance measures (only for classification tasks) that go unused in most Kaggle challenges (except the ROC curve).
8) Unnecessarily repeated code for the training, validation and test datasets.
9) No reference to serious tuning methods in dedicated libraries; no optimization strategies are mentioned.
10) No relevant resampling strategies, which are an important part of machine learning, are reported.
11) Classic classification algorithms such as 'rpart', 'randomForest' and 'xgboost' are presented in an EXTREMELY elementary way. Any tutorial on GitHub, Kaggle, or any other online resource is much superior.
12) What also makes me laugh is a recommendation the author makes in the "Exercise section" of the Ensemble chapter: "try also deep Neural networks as algorithm..." Good... Deep learning is a beast. It cannot be handled or mastered even in a dedicated book, and he says: try deep learning, without even mentioning the algorithm or the very new relevant libraries like the TensorFlow framework and Keras (its wrapper, available for Python and R).