Hadoop

When you want to measure fractions of a millimeter, you get a micrometer. When you want to measure centimeters, you get a ruler. When you want to measure kilometers, you might use a laser beam. The abstract task is the same in all cases, but the tools differ significantly based on the size of the measurement.

Likewise, some computations can be done quickly on data structures that fit into memory. Some data sets can't fit into memory, but will fit on the direct-attached disk of a single computer. But when you've got many terabytes or even petabytes of data, you need tooling adapted to the scale of the task. Enter Hadoop.

Hadoop is a widely used open source framework for storing massive data sets across distributed clusters of computers and efficiently distributing computational tasks around the cluster. Come learn about the Hadoop Distributed File System (HDFS), the MapReduce pattern and its implementation, and the broad ecosystem of tools, products, and companies that have grown up around this groundbreaking project.


About Tim Berglund


Tim is a full-stack generalist and passionate teacher who loves coding, presenting, and working with people. He believes the best developer is one who is well informed about specifics and can also make deep connections between software development and the broader world. He has recently been exploring non-relational data stores, continuous deployment, and how software architecture should resemble an ant colony.

His firm, the August Technology Group, helps clients with product development, technology consulting, and technology upgrade projects atop the JVM. The August Group's technology preferences reflect the generalist sensibilities of its founder, and its development practices are always lightweight, self-improving, and humanizing by design.

Tim speaks internationally and on the No Fluff Just Stuff tour in the United States. He is co-president of the Denver Open Source User Group, co-author of the DZone Clojure RefCard, co-presenter of the best-selling O'Reilly Git Master Class, co-author of Building and Testing with Gradle, and a member of the O'Reilly Expert Network.

He lives in Littleton, CO with the wife of his youth and their three children.
