● Can we load data directly into HBase?
● How will you create directories in HDFS based on the timestamp present in the input file?
● What will happen if no timestamps are present in the input file?
● What is the workflow of Flume?
● What are the channel types in Flume (Memory, JDBC, File)? Which one is the fastest? (Memory)
● How will you start a Flume agent from the command line?
● What are interceptors in Flume?
● Why do we get a NumberFormatException when using format escape sequences for date and time (%Y, %M, %D, etc.)?
● What is the bridge mechanism used for a multi-hop agent setup in Flume?
● Which channel type is reliable enough to ensure there is no data loss (JDBC, File, or Memory)?
● What is fan-out flow in Flume?
● What are the event serializers available in Flume?
● How do we collect records in JSON format directly through Flume?
● What is the difference between FileSink and File Roll Sink?
● What is the difference between the AsyncHBaseSink and HBaseSink types?
● Can we perform real-time analysis directly on the data collected by Flume? If yes, how?
● If we need the speed of the memory channel and the data reliability of the file channel in a single agent channel, how can we achieve this?
● What are multiplexing selectors in Flume?
● What are replicating selectors in Flume?
● What is the use of HostInterceptor in Flume?
● What is the advantage of UUIDInterceptor in Flume?
● When defining the type of a source or sink in Flume, is it mandatory to provide the full class name?