tmalaska

Ted Malaska tmalaska

297 followers · 1 following

Cloudera


hadcom.utils Public

Advanced common functionality for Hadoop

Java 6 6 Apache License 2.0 Updated Jul 1, 2022
Taxi360 Public

Simple example of HBase, Solr, and Kudu for Entity 360 using NY taxi data

Scala 6 5 Apache License 2.0 Updated Jul 1, 2022
AppTrans Public

Examples for training

Scala 1 3 Apache License 2.0 Updated Jul 1, 2022
IngestProcessStoreInNRT Public

A demo/training application that shows how easy it is to do operations like ingestion, aggregation, and change data capture using tools like Kafka, Spark Streaming, Flume, Kudu, Solr, H…

Scala 1 3 Apache License 2.0 Updated Jul 1, 2022
SparkUnitTestingExamples Public

This project is a collection of Spark unit test examples that show new Spark users how to unit test their code for Spark Core, Spark SQL, and Spark Streaming

Scala 36 27 Apache License 2.0 Updated Sep 30, 2020
CopybookInputFormat Public

Using JRecord to build a mapred and mapreduce inputformat for HDFS, MAPREDUCE, PIG, HIVE, Spark, ...

Java 18 16 Apache License 2.0 Updated Dec 7, 2017
Spark.TableStatsExample Public

Simple Spark example of generating table stats for use in data quality checks

Scala 28 30 Apache License 2.0 Updated Apr 28, 2017
kairosdb Public
Forked from kairosdb/kairosdb

Fast scalable time series database

Java Apache License 2.0 Updated Jan 22, 2017
spark.mergesort.example Public

An example of how to do a merge sort

Scala 1 3 Apache License 2.0 Updated Sep 15, 2016
CleanUpEmptyFilesTool Public

This tool is designed to look through your HDFS folders to either identify files with no data in them or delete them.

Scala 3 6 Apache License 2.0 Updated Aug 31, 2016
node-scale Public

A tool to figure out when to grow or shrink a cluster

Java Apache License 2.0 Updated Jul 12, 2016
SparkOnKudu Public

Based on the design of SparkOnHBase. This repo will support Spark, Spark Streaming, and Spark SQL integration with Kudu.

Scala 51 45 Apache License 2.0 Updated May 19, 2016
SparkOnHBase Public

Scala 24 17 Apache License 2.0 Updated Feb 10, 2016
EnergyMonitorHBaseExample Public

FooBar

Scala 3 Apache License 2.0 Updated Nov 12, 2015
HBase.MCC Public

HBase.MCC (HBase Multi Cluster Client). The goal is to support always-up solutions with HBase through multiple clusters

Java 14 12 Apache License 2.0 Updated Nov 9, 2015
FunHBaseLoaderExamples Public

Just for fun; do not use in the real world. :)

Java 1 Apache License 2.0 Updated Sep 25, 2015
kite Public
Forked from kite-sdk/kite

Kite SDK

Java Apache License 2.0 Updated Jan 6, 2015
Spark.ProdictBehaviorBasedOnPastActives Public

This is an example of how to do window analysis with Spark

Scala 2 1 Apache License 2.0 Updated Nov 24, 2014
SparkStreaming.Sessionization Public

NRT Sessionization with Spark Streaming landing on HDFS and putting live stats in HBase

Scala 51 42 Apache License 2.0 Updated Oct 31, 2014
Directed.ReBalancing Public

The ability to rebalance on clusters that have HBase by selecting folders to rebalance

Java Apache License 2.0 Updated Oct 8, 2014
SparkStreamingSeqSink Public

Support for writing Seq Files with Spark Streaming, with functionality similar to the Flume HDFS Sink with Seq Files

Scala 1 Apache License 2.0 Updated Sep 21, 2014
flume-ng-kafka-source Public
Forked from frankyaorenjie/flume-ng-kafka-source

Java Apache License 2.0 Updated Sep 5, 2014
spark Public
Forked from apache/spark

Mirror of Apache Spark

Scala Apache License 2.0 Updated Aug 1, 2014
FixedLengthInputFormat Public

This is a FixedLengthInputFormat for Hadoop MapReduce.

Java 1 1 Updated Jul 6, 2014
Spark..Unique.Seq.Generator Public

This is an example of how to make unique sequences in a distributed way with Spark (no dups, no skips)

Java 3 Apache License 2.0 Updated Jul 3, 2014
Spark.GraphX.Examples Public

Just some examples of using GraphX

Scala 3 2 Apache License 2.0 Updated Jul 1, 2014
Giraph.TreeRooter.Example Public

A simple example of using Giraph to root nodes in a tree

Java 2 Apache License 2.0 Updated Jun 29, 2014
FileIngestor Public

A simple program to put files from a directory into HDFS, with the added functionality of defining how that action will happen

Java 3 3 Apache License 2.0 Updated Jun 25, 2014
UnbalancedBucketMergeJoin Public

This will do a merge join of absolutely sorted data, with any number of files on either side.

Java 1 Apache License 2.0 Updated Jun 17, 2014
HBase-FastTableCopy Public

This will contain implementations that copy records from a table with fewer regions than the final table.

Java 1 1 Apache License 2.0 Updated May 28, 2014
