Data Science at the Command Line

Facing the Future with Time-Tested Tools

Author: Jeroen Janssens

Publisher: O'Reilly Media, Inc.

ISBN: 1491947802

Category: Computers

Page: 212

This hands-on guide demonstrates how the flexibility of the command line can help you become a more efficient and productive data scientist. You’ll learn how to combine small, yet powerful, command-line tools to quickly obtain, scrub, explore, and model your data. To get you started (whether you’re on Windows, OS X, or Linux), author Jeroen Janssens introduces the Data Science Toolbox, an easy-to-install virtual environment packed with over 80 command-line tools. Discover why the command line is an agile, scalable, and extensible technology. Even if you’re already comfortable processing data with, say, Python or R, you’ll greatly improve your data science workflow by also leveraging the power of the command line.

- Obtain data from websites, APIs, databases, and spreadsheets
- Perform scrub operations on plain text, CSV, HTML/XML, and JSON
- Explore data, compute descriptive statistics, and create visualizations
- Manage your data science workflow using Drake
- Create reusable tools from one-liners and existing Python or R code
- Parallelize and distribute data-intensive pipelines using GNU Parallel
- Model data with dimensionality reduction, clustering, regression, and classification algorithms

Doing Digital Humanities

Practice, Training, Research

Author: Constance Crompton, Richard J Lane, Ray Siemens

Publisher: Taylor & Francis

ISBN: 1317481135

Category: Literary Criticism

Page: 408

Digital Humanities is rapidly evolving as a significant approach to, and method of, teaching, learning and research across the humanities. This is a first-stop book for people interested in getting to grips with digital humanities, whether as a student or a professor. The book offers a practical guide to the area as well as reflection on the main objectives and processes, including:

- Accessible introductions to the basics of Digital Humanities through to more complex ideas
- A wide range of topics, including feminist Digital Humanities, digital journal publishing, gaming, text encoding, project management and pedagogy
- Contextualised case studies
- Resources for starting Digital Humanities such as links, training materials and exercises

Doing Digital Humanities looks at the practicalities of how digital research and creation can enhance both learning and research, and offers an approachable way into this complex yet essential topic.

The Data Bonanza

Improving Knowledge Discovery in Science, Engineering, and Business

Author: Malcolm Atkinson, Rob Baxter, Peter Brezany, Oscar Corcho, Michelle Galea, Mark Parsons, David Snelling, Jano van Hemert

Publisher: John Wiley & Sons

ISBN: 1118540301

Category: Computers

Page: 576

Complete guidance for mastering the tools and techniques of the digital revolution

With the digital revolution opening up tremendous opportunities in many fields, there is a growing need for skilled professionals who can develop data-intensive systems and extract information and knowledge from them. This book frames for the first time a new systematic approach for tackling the challenges of data-intensive computing, providing decision makers and technical experts alike with practical tools for dealing with our exploding data collections.

Emphasizing data-intensive thinking and interdisciplinary collaboration, The Data Bonanza: Improving Knowledge Discovery in Science, Engineering, and Business examines the essential components of knowledge discovery, surveys many of the current research efforts worldwide, and points to new areas for innovation. Complete with a wealth of examples and DISPEL-based methods demonstrating how to gain more from data in real-world systems, the book:

- Outlines the concepts and rationale for implementing data-intensive computing in organizations
- Covers from the ground up problem-solving strategies for data analysis in a data-rich world
- Introduces techniques for data-intensive engineering using the Data-Intensive Systems Process Engineering Language DISPEL
- Features in-depth case studies in customer relations, environmental hazards, seismology, and more
- Showcases successful applications in areas ranging from astronomy and the humanities to transport engineering
- Includes sample program snippets throughout the text as well as additional materials on a companion website

The Data Bonanza is a must-have guide for information strategists, data analysts, and engineers in business, research, and government, and for anyone wishing to be on the cutting edge of data mining, machine learning, databases, distributed systems, or large-scale computing.