FLUME

INTERVIEW QUESTIONS

● Can we load data directly into HBase?

● How will you create directories in HDFS based on the timestamp present in the input file?

● What will happen if no timestamps are present in the input file?

● What is the workflow of Flume?

● What are the channel types in Flume (Memory, JDBC, File)? Which one is the fastest? (Memory)

● How will you start a Flume agent from the command line?

● What are interceptors in Flume?

● Why do we get a NumberFormatException when using format escape sequences for date and time (%Y, %M, %D, etc.)?

● What is the bridge mechanism used for a multi-hop agent setup in Flume?

● Which channel is the most reliable for ensuring there is no data loss (JDBC, File, or Memory)?

● What is fan-out flow in Flume?

● What are the event serializers available in Flume?

● How do we collect records in JSON format directly through Flume?

● What is the difference between FileSink and File Roll Sink?

● What is the difference between the AsyncHBaseSink and HBaseSink sink types?

● Can we perform real-time analysis directly on the data collected by Flume? If yes, how?

● If we need the speed of the memory channel and the data reliability of the file channel in a single agent channel, how can we achieve this?

● What are multiplexing selectors in Flume?

● What are replication selectors in Flume?

● What is the use of HostInterceptor in Flume?

● What is the advantage of UUIDInterceptor in Flume?

● When defining the type of a source or sink in Flume, is it mandatory to provide the full class name?