mrjob

Here are 61 public repositories matching this topic...

groda / big_data

Big Data essentials: Hadoop, MapReduce, Spark. Explore tutorials and demos in Jupyter notebooks—most are self-contained and live, ready to run with a click.

docker big-data spark apache-spark hadoop bigdata jupyter-notebook pyspark hadoop-cluster mapreduce gutenberg-ebooks hadoop-mapreduce spark-sql mrjob bigtop hadoop-hdfs testdfsio mapreduce-bash apache-sedona

Updated Mar 8, 2026
Jupyter Notebook

jehiah / gomrjob

gomrjob - a Go Framework for Hadoop Map Reduce Jobs

go hadoop mapreduce mrjob dataproc

Updated Aug 12, 2025
Go

Tarasa24 / PWA-Store

The largest collection of publicly accessible Progressive Web Apps*

emr golang crawler pwa linode postgresql mrjob commoncrawl puppeteer

Updated Sep 2, 2022
HTML

MHassaanButt / Flight-Delays-Prediction

In this project, I used a decision tree learning model as the main algorithm. Because of the large amount of flight data, I implemented the project using MRJob, PySpark, and Spark's MLlib, then compared the performance and accuracy of those implementations.

hadoop pyspark decision-tree mrjob spark-mllib

Updated Dec 21, 2021
Jupyter Notebook

HearthSim / articles

Analysis of Hearthstone replays

emr hearthstone replays mrjob

Updated May 18, 2017
Jupyter Notebook

nanfengpo / hadoop-with-python-code

Exercises and examples developed for the Hadoop with Python tutorial

hadoop mrjob

Updated Apr 13, 2015
Jupyter Notebook

thedatasociety / lab-hadoop

hive hadoop hbase flume sqoop hadoop-mapreduce hadoop-streaming mrjob hadoop-hdfs hadoop-yarn

Updated Jan 19, 2024
PLpgSQL

tugrulhkarabulut / hadoop-movie-rating-prediction

Movie rating prediction application

flask machine-learning natural-language-processing hadoop hadoop-cluster hadoop-mapreduce mrjob

Updated Jun 30, 2021
CSS

shinde-chandrakant / BigData-Ops-on-TLC-Yellow-Taxi

Analysed New York City's Yellow taxi data set with Big Data tools such as Hadoop, HBase, Sqoop, MapReduce and AWS Cloud Infrastructure.

aws hadoop aws-s3 bigdata hbase aws-emr mapreduce aws-rds data-modeling sqoop mrjob big-data-analytics

Updated Sep 18, 2023
Python

burhanahmed1 / Big-Data-Analytics

Practice tasks in the Python programming language using Hadoop, MRJob, and PySpark for Big Data Analytics.

python spark apache-spark hadoop jupyter-notebook pyspark sparksql hadoop-mapreduce spark-sql mrjob

Updated Jun 28, 2024
Jupyter Notebook

MadimetjaMadix / ELEN4020A_Lab3

Using MapReduce Framework

mrs python3 mapreduce mrjob

Updated Apr 16, 2019
TeX

Mariona-FT / Information-Retrieval-REIN

Information Retrieval, 2023-24 course, EPSEVG

elasticsearch information-retrieval indexing upc tokenization mrjob rastreator epsevg

Updated Apr 13, 2024
Jupyter Notebook

devanshk01 / hadoop-on-windows11

End-to-end guide and code for installing, configuring, and running Apache Hadoop 3.2.4 on Windows 11 — includes configuration templates, sample data, troubleshooting notes, and example Python MapReduce jobs using mrjob.

setup tutorial hadoop mapreduce mrjob windows-11

Updated Aug 12, 2025

jonathanAmancioSales / BigData_AWS_EMR_MRJob_DIO

Distributed data processing project using Python, MRJob, and AWS EMR

aws cloud aws-s3 s3 s3-bucket aws-emr aws-ec2 mrjob

Updated Aug 8, 2021
Python

ARomoH / Basic-Sentiment-Analysis-MrJob-Twitter-

Project that performs dictionary-based sentiment analysis, implemented with MrJob using a map-reduce model. It can be executed locally or in HDFS environments (such as Hadoop or AWS).

hadoop sentiment-analysis map-reduce aws-ec2 mrjob twiiter hdfs-enviroments

Updated Sep 18, 2017
Python

JaredP94 / MapReduce-Matrix-Multiplication

python mapreduce mrjob

Updated Apr 9, 2018
Python

aneessaheba / hadoop-news-analytics

Distributed word frequency analysis on 5,000 HuffPost news headlines using Apache Hadoop MapReduce and mrjob. Single-node cluster on Docker with HDFS and YARN configured from scratch. Top 50 keywords extracted via a 2-step MapReduce pipeline with NLTK stopword filtering.

Updated Mar 7, 2026
Python

esakik / data-engineering-essentials

Samples related to data engineering, e.g. spark, embulk, airflow, etc.

apache-spark protocol-buffers amazon-emr data-engineering digdag fluentd apache-beam embulk apache-avro mrjob apache-airflow cloud-dataflow apache-hadoop cloud-dataproc

Updated Dec 8, 2022
Python

mrjuice01 / SharpGenTools

Accurate and high performance C++ interop code generator for C#.

css csharp mrjob

Updated Nov 21, 2023
C

matchilling / kata-mapreduce

kata aws-emr mapreduce mrjob

Updated Jan 11, 2018
Jupyter Notebook
