Big Data Using Hadoop and Hive (ebook)

206 pages, English, ebook, 2021

This book is a basic guide for developers, architects, engineers, and anyone who wants to start leveraging the open-source software Hadoop and Hive to build distributed, scalable, concurrent big data applications. Hive is used for reading, writing, and managing large data set files. The book is a concise guide to getting started, giving an overall understanding of Apache Hadoop and Hive and how they work together to speed up development with minimal effort. It relies on simple concepts and examples, as they are likely to be the best teaching aids. It explains the logic, code, and configurations needed to build a successful distributed, concurrent application, as well as the reasons behind those decisions.

FEATURES:
Shows how to leverage the open-source software Hadoop and Hive to build distributed, scalable, concurrent big data applications
Includes material on Hive architecture with various storage types and the Hive query language
Features a chapter on big data and how Hadoop can be used to solve the challenges around it
Explains the basic Hadoop setup, configuration, and optimization

Publication date: March 24, 2021
EAN13: 9781683926443
Language: English
File size: 7 MB

BIG DATA USING HADOOP™ AND HIVE™
LICENSE, DISCLAIMER OF LIABILITY, AND LIMITED WARRANTY
By purchasing or using this book (the “Work”), you agree that this license grants permission to use the contents contained herein, but does not give you the right of ownership to any of the textual content in the book or ownership to any of the information or products contained in it. This license does not permit uploading of the Work onto the Internet or on a network (of any kind) without the written consent of the Publisher. Duplication or dissemination of any text, code, simulations, images, etc. contained herein is limited to and subject to licensing terms for the respective products, and permission must be obtained from the Publisher or the owner of the content, etc., in order to reproduce or network any portion of the textual material (in any media) that is contained in the Work.
MERCURY LEARNING AND INFORMATION (“MLI” or “the Publisher”) and anyone involved in the creation, writing, production, accompanying algorithms, code, or computer programs (“the software”), and any accompanying Web site or software of the Work, cannot and do not warrant the performance or results that might be obtained by using the contents of the Work. The author, developers, and the Publisher have used their best efforts to ensure the accuracy and functionality of the textual material and/or programs contained in this package; we, however, make no warranty of any kind, express or implied, regarding the performance of these contents or programs. The Work is sold “as is” without warranty (except for defective materials used in manufacturing the book or due to faulty workmanship).
The author, developers, and the publisher of any accompanying content, and anyone involved in the composition, production, and manufacturing of this work will not be liable for damages of any kind arising out of the use of (or the inability to use) the algorithms, source code, computer programs, or textual material contained in this publication. This includes, but is not limited to, loss of revenue or profit, or other incidental, physical, or consequential damages arising out of the use of this Work.
The sole remedy in the event of a claim of any kind is expressly limited to replacement of the book and only at the discretion of the Publisher. The use of “implied warranty” and certain “exclusions” vary from state to state, and might not apply to the purchaser of this product.
BIG DATA USING HADOOP™ AND HIVE™
NITIN KUMAR
MERCURY LEARNING AND INFORMATION
Dulles, Virginia
Boston, Massachusetts
New Delhi
Copyright ©2021 by MERCURY LEARNING AND INFORMATION LLC. All rights reserved.
This publication, portions of it, or any accompanying software may not be reproduced in any way, stored in a retrieval system of any type, or transmitted by any means, media, electronic display or mechanical display, including, but not limited to, photocopy, recording, Internet postings, or scanning, without prior permission in writing from the publisher.
Publisher: David Pallai
MERCURY LEARNING AND INFORMATION
22841 Quicksilver Drive
Dulles, VA 20166
info@merclearning.com
www.merclearning.com
800-232-0223
Nitin Kumar. Big Data Using Hadoop™ and Hive™. ISBN: 9781683926450
The publisher recognizes and respects all marks used by companies, manufacturers, and developers as a means to distinguish their products. All brand names and product names mentioned in this book are trademarks or service marks of their respective companies. Any omission or misuse (of any kind) of service marks or trademarks, etc. is not an attempt to infringe on the property of others.
Library of Congress Control Number: 2021934303
21 22 23 3 2 1
This book is printed on acid-free paper in the United States of America.
Our titles are available for adoption, license, or bulk purchase by institutions, corporations, etc. For additional information, please contact the Customer Service Dept. at 800-232-0223 (toll free).
All of our titles are available in digital format at academiccourseware.com and other digital vendors. The sole obligation of MERCURY LEARNING AND INFORMATION to the purchaser is to replace the book, based on defective materials or faulty workmanship, but not based on the operation or functionality of the product.
To my wife Sarika, my children, Shaurya and Irima, and to my parents
CONTENTS

Preface

Chapter 1: Big Data
    Big Data Challenges for Organizations
    How We Are Using Big Data
    Big Data: An Opportunity
    Hadoop: A Big Data Solution
    Big Data in the Real World

Chapter 2: What is Apache Hadoop?
    Hadoop History
    Hadoop Benefits
    Hadoop’s Ecosystem: Components
    Hadoop Core Component Architecture
    Summary

Chapter 3: The Hadoop Distribution Filesystem
    HDFS Core Components
    HDFS Architecture
    Data Replication
    Data Locality
    Data Storage
    Failure Handling on the HDFS
    Erasure Coding (EC)
    HDFS Disk Balancer
    HDFS Federation
    HDFS Architecture and Its Challenges
    Hadoop Federation: A Rescue
    Benefits of the HDFS Federation
    HDFS Processes: Read and Write
    Failure Handling During Read and Write

Chapter 4: Getting Started with Hadoop
    Hadoop Configuration
    Command-Line Interface
    Generic Filesystem CLI Command
    Distributed Copy (distcp)
    Hadoop’s Other User Commands
    HDFS Permissions
    HDFS Quotas Guide
    HDFS Short-Circuit Local Reads
    Offline Edits Viewer Guide
    Offline Image Viewer Guide

Chapter 5: Interfaces to Access HDFS Files
    WebHDFS REST API
    FileSystem URIs
    Error Responses
    Authentication
    Java FileSystem API
    URI and Path
    FSDataInputStream
    FSDataOutputStream
    FileStatus
    Directories
    Delete Files
    C API libhdfs

Chapter 6: Yet Another Resource Negotiator
    YARN Architecture
    YARN Process Flow
    YARN Failures
    YARN High Availability
    YARN Schedulers
    The Fair Scheduler
    The Capacity Scheduler
    The YARN Timeline Server
    Application Timeline Server (ATS)
    ATS Data Model Structure
    ATS V2
    YARN Federation

Chapter 7: MapReduce
    MapReduce Process
    Key Features
    Different Phases in the MapReduce Process
    MapReduce Architecture
    MapReduce Sample Program
    MapReduce Composite Key Operation
    Mapper Program
    MapReduce Configuration

Chapter 8: Hive
    Hive History
    Hive Query
    Data Storage
    Data Model
    Complex Data Types
    Hive DDL (Data Definition Language)
    Tables
    View
    Partition
    Bucketing
    Hive Architecture
    Serialization/Deserialization (SerDe)
    Metastore
    Query Compiler
    HiveServer2

Chapter 9: Getting Started with Hive
    Hive Setup
    Hive Configuration Settings
    Loading and Inserting Data into Tables
    Insert from a Select Query
    Load Table Data into File
    Create and Load Data into a Table
    Hive Transactions
    Enable Transactions
    Insert Values
    Update
    Delete
    Merge
    Locks
    Hive Select Query
    Select Basic Query
    Hive QL File
    Hive Select on Complex Datatypes
    Order By and Sort By
    Distribute By and Cluster By
    Group By and Having
    Built-in Aggregate Functions
    Enhanced Aggregation
    Table-Generating Functions
    Built-In Utility Functions
    Collection Functions
    Date Functions
    Conditional Functions
    String Functions
    Hive Query Language-Join