Big Data Engineer Interview Questions

1,228 big data engineer interview questions shared by candidates

Project Day: Message handling server

• Development environment: Eclipse
• Help tools: Internet
• Time frame: as much as you need (here at NICE)
• Target products:
  o Message Client. Requirements: a web application that lets the user send several types of messages (integers, fractions, strings, etc.) and see the output printed by the server. The data in each message should be numeric in form, e.g. 6, 1/7, "6".
  o Message Server. Requirements:
    • The solution should implement 2 message servers that can handle the several types of messages sent from the client web application.
    • Server 1:
      • The server should be able to receive several messages from different clients simultaneously.
      • There should be a message handler that handles the messages and sends them sequentially to the second server.
    • Server 2:
      • Should receive each message, print it in its original form, and compute the sum of all the data.
  o Documentation:
    • Design document: describe how you are going to implement your application, in as much detail as you can (architecture, data structures, algorithms, components you are going to use).
    • Unit test document: describe the scenarios you are going to run in order to debug your application. Once all these scenarios pass with the expected results, there shouldn't be any bugs in the application.
• Your goals:
  o Design the class diagram
  o Review the proposed solution
  o Implement the server and client applications
• General directions:
  o The first priority is a complete solution that implements all specified requirements with as few bugs as possible.
  o Industrial code standards should be used both for the main issues (performance, readability, etc.) and for the cosmetic issues (clean code, formatted code, documentation).
  o After completing each step, it should be presented before moving on to the next step.

Good luck
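One way to picture the two-server flow described in this task is the minimal Python sketch below. It is not the expected solution (the task itself names Eclipse and a web client); the class names `ForwardingServer` and `SummingServer` and the helper `parse_message` are hypothetical, and an in-process queue stands in for the network hop between Server 1 and Server 2. Exact fractions are assumed so that a message like 1/7 is summed without rounding.

```python
import queue
import threading
from fractions import Fraction


def parse_message(raw):
    """Turn a raw message such as '6', '1/7' or '"6"' into an exact Fraction."""
    return Fraction(raw.strip().strip('"').strip())


class SummingServer:
    """Stand-in for Server 2: prints each message as received, keeps a running sum."""

    def __init__(self):
        self.total = Fraction(0)

    def receive(self, raw):
        print(raw)  # original form, before parsing
        self.total += parse_message(raw)


class ForwardingServer:
    """Stand-in for Server 1: accepts messages concurrently, forwards one at a time."""

    def __init__(self, downstream):
        self._inbox = queue.Queue()
        self._downstream = downstream
        worker = threading.Thread(target=self._forward, daemon=True)
        worker.start()

    def submit(self, raw):
        # queue.Queue is thread-safe, so many client threads may call this.
        self._inbox.put(raw)

    def _forward(self):
        while True:
            raw = self._inbox.get()
            # Only this single worker thread talks to Server 2,
            # which makes the forwarding sequential.
            self._downstream.receive(raw)
            self._inbox.task_done()

    def drain(self):
        """Block until every queued message has been forwarded."""
        self._inbox.join()


server2 = SummingServer()
server1 = ForwardingServer(server2)
for message in ['6', '1/7', '"6"']:
    server1.submit(message)
server1.drain()
print(server2.total)  # 85/7
```

The single worker thread is the design choice that matters here: concurrency is absorbed by the queue, so Server 2 never needs its own locking.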

Software Engineer - Big Data Team

Interviewed at NiCE

3.9
Apr 14, 2015


L1 - Technical

• Introduce yourself.
• What is your project?
• What are your source data types (CSV/RDBMS)? How do you get the data?
• How big was the client cluster?
• What was the data size?
• Was the load daily, weekly or monthly?
• Why did the client select Hadoop rather than an RDBMS?
• Which tool did you use for workflow?
• What is staging in Spark?
• What is an RDD?
• What is the intention behind lazy evaluation?
• What is the intention behind keeping RDDs immutable (unable to be updated)?
• You have written multiple transformations on your RDD but have not yet fired any action. What will your Spark server web UI look like?
• Suppose you fire an action on an RDD. What exactly happens internally in Spark? (Here I explained that Spark walks backward one step at a time through the lineage graph to work out the required RDDs, then computes the first RDD and works forward again to the action.)
• Which are the transformations in Spark?
• I have given you an RDD. How will you convert it to a paired RDD using its first element as the key? ans: RDD2 = RDD1.map(lambda x: (x[0], x))
• What is the difference between Hadoop 2.x and 1.x?
• What is the HA concept?
• What if the NameNode fails? What do you do, and who was doing it in your project?
• What is the heartbeat concept?
• I have a 500 MB file on Hadoop 2.x. How many blocks and replicas will there be?
• I have a file:

    home_id product meter
    h1 p1 20
    h1 p2 30
    H2 p2 23

  I want to create partitions with home_id as the key. How will you do it on the local file system without using Spark, Hive or MapReduce? Use a simple programming language like Java or Python. Later, how will you do it in Hive and Spark?
• I have a 3x3 array which is sorted:

    1 3 5
    7 8 9
    11 15 18

  Write a program so that if the user passes any element from the terminal, it returns its exact position in the array. (I did it as below.)

    a = [(1, 3, 5), (7, 8, 9), (11, 15, 18)]
    x = int(input())  # user input
    for i in range(3):
        for j in range(3):
            if x == a[i][j]:
                print('location of x: %d %d' % (i, j))  # 0-based row, column

L2 - Technical

• There is a file:

    Name id
    Ajay 1
    Ram 2
    Ajay 3
    Ram 4
    Jack 6
    Devid 7

  ID is unique and Name may be repeated. Write a program so that when the user enters the name 'ajay', the program returns the list of IDs [1, 3]. Input Ram: output [2, 4].
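For the 500 MB file question, the arithmetic can be sketched as below. The sketch assumes the stock Hadoop 2.x defaults of a 128 MB block size and a replication factor of 3; a cluster configured differently would give other numbers. The function name `hdfs_blocks` is hypothetical.

```python
import math


def hdfs_blocks(file_mb, block_mb=128, replication=3):
    """Return (block count, size of last block in MB, total stored block replicas)."""
    blocks = math.ceil(file_mb / block_mb)
    last_block_mb = file_mb - (blocks - 1) * block_mb
    return blocks, last_block_mb, blocks * replication


print(hdfs_blocks(500))  # (4, 116, 12)
```

Under those assumptions the file occupies 4 blocks (three of 128 MB and one of 116 MB), and with 3 copies of each there are 12 block replicas stored across the cluster.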
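The home_id partitioning question (local file system, no Spark, Hive or MapReduce) could be approached roughly as follows. This is only a sketch: the function name `partition_by_key`, the Hive-style `home_id=<value>` directory naming, and the `part-00000.txt` file name are all assumptions chosen for illustration, and the caller is assumed to have already skipped the header row.

```python
import os
from collections import defaultdict


def partition_by_key(lines, out_dir, key_index=0):
    """Group whitespace-separated rows by one column and write one file per key."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.split()
        if fields:  # ignore blank lines
            groups[fields[key_index]].append(line.strip())

    paths = {}
    for key, rows in groups.items():
        # One directory per key value, mimicking a partitioned table layout.
        part_dir = os.path.join(out_dir, "home_id=" + key)
        os.makedirs(part_dir, exist_ok=True)
        path = os.path.join(part_dir, "part-00000.txt")
        with open(path, "w") as handle:
            handle.write("\n".join(rows) + "\n")
        paths[key] = path
    return paths
```

Calling `partition_by_key(["h1 p1 20", "h1 p2 30", "H2 p2 23"], "out")` would write two files, one holding both h1 rows and one holding the H2 row. Keys are compared exactly as written, so h1 and H2 stay distinct.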
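For the sorted 3x3 array question, the nested-loop answer works, but because the values are sorted in row-major order the array can also be treated as one sorted list of 9 elements and searched with a binary search. The sketch below is an alternative, not the answer the candidate gave; `find_position` is a hypothetical name and positions are 0-based.

```python
def find_position(matrix, target):
    """Return (row, col) of target in a row-major sorted matrix, or None if absent."""
    rows, cols = len(matrix), len(matrix[0])
    low, high = 0, rows * cols - 1
    while low <= high:
        mid = (low + high) // 2
        # Map the flat index back to a row and a column.
        value = matrix[mid // cols][mid % cols]
        if value == target:
            return mid // cols, mid % cols
        if value < target:
            low = mid + 1
        else:
            high = mid - 1
    return None


grid = [(1, 3, 5), (7, 8, 9), (11, 15, 18)]
print(find_position(grid, 8))   # (1, 1)
print(find_position(grid, 4))   # None
```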
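The name-to-IDs question from the L2 round can be sketched with a dictionary of lists. The helper names `build_index` and `ids_for` are hypothetical, and the lookup is assumed to be case-insensitive because the question enters 'ajay' for the stored name Ajay.

```python
from collections import defaultdict


def build_index(lines):
    """Map each lower-cased name to the list of IDs seen for it, in file order."""
    index = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 2 or not parts[1].isdigit():
            continue  # skips the "Name id" header and malformed rows
        name, id_text = parts
        index[name.lower()].append(int(id_text))
    return index


def ids_for(index, name):
    """Return the IDs recorded for a name, or an empty list if it is unknown."""
    return index.get(name.strip().lower(), [])


rows = ["Name id", "Ajay 1", "Ram 2", "Ajay 3", "Ram 4", "Jack 6", "Devid 7"]
index = build_index(rows)
print(ids_for(index, "ajay"))  # [1, 3]
print(ids_for(index, "Ram"))   # [2, 4]
```

Building the index once costs a single pass over the file, after which each lookup is a dictionary access rather than a rescan.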

Big Data Engineer

Interviewed at Persistent Systems

4.2
Oct 28, 2018



Glassdoor has 1,228 interview questions and reports from Big data engineer interviews. Prepare for your interview. Get hired. Love your job.