Skip to content

Latest commit

 

History

History
21 lines (17 loc) · 1.09 KB

File metadata and controls

21 lines (17 loc) · 1.09 KB

Built In Methods for Streaming Import

The StreamingContext has many built in methods for importing data to streaming. socketTextStream was introduced in the previous chapter, and textFileStream is introduced here. The textFileStream method monitors any Hadoop-compatible filesystem directory for new files and when it detects a new file - reads it into Spark Streaming. Just replace the call to socketTextStream with textFileStream, and pass in the directory to monitor for log files.

// This methods monitors a directory for new files
// to read in for streaming.
JavaDStream<String> logData = jssc.textFileStream(directory);

Try running LogAnalyzerStreamingImportDirectory.java by specifying a directory. You'll also need to drop or copy some new log files into that directory while the program is running to see the calculated values update.

There are more built-in input methods for streaming - check them out in the reference API documents for the StreamingContext.