Explain how you would design an ETL pipeline to move and transform data from multiple sources.
Data Engineer Interview Questions
21,097 data engineer interview questions shared by candidates
What is the difference between shallow and deep copy in Python?
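A minimal sketch of the distinction, using the standard-library `copy` module. The dictionary and its contents are made up for illustration: a shallow copy creates a new outer object but shares nested objects with the original, while a deep copy duplicates the nested objects too.

```python
import copy

# Hypothetical record with a nested, mutable list.
original = {"name": "job-1", "tags": ["raw", "daily"]}

shallow = copy.copy(original)      # new dict, same inner list object
deep = copy.deepcopy(original)     # new dict and a new inner list

# Mutate the nested list through the original.
original["tags"].append("validated")

print(shallow["tags"])  # the shallow copy sees the change
print(deep["tags"])     # the deep copy does not
```

Rebinding a top-level key on the original (for example `original["name"] = "job-2"`) affects neither copy; only mutations of shared nested objects show through a shallow copy.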
What are the general-purpose Azure Storage types?
What is your knowledge of cloud, and why do you want to work in this area?
I was asked the difference between DBMS and RDBMS. Also, when denormalised forms can be better than normalised forms.
Suppose I have records like this:
("a-b", "data1", 1)
("a-c", "data2", 1)
("a-b", "data3", 1)
How can I group and sum, such that I have the following results when the input is a DataStream?
("a-b", ["data1", "data3"], 2)
("a-c", ["data2"], 1)
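The grouping logic itself can be sketched in plain Python. This is not the DataStream API the question asks about; it only illustrates the per-key accumulation (collect the values, sum the counts) on an in-memory list. The function name `group_and_sum` is hypothetical.

```python
def group_and_sum(records):
    """Group (key, value, count) records by key.

    Returns one (key, [values...], total_count) tuple per key,
    in order of first appearance.
    """
    values = {}   # key -> list of values seen for that key
    totals = {}   # key -> running sum of counts
    for key, value, count in records:
        values.setdefault(key, []).append(value)
        totals[key] = totals.get(key, 0) + count
    return [(key, values[key], totals[key]) for key in values]


records = [("a-b", "data1", 1), ("a-c", "data2", 1), ("a-b", "data3", 1)]
result = group_and_sum(records)
print(result)
```

In a streaming engine, the same accumulator would typically be kept as per-key state and emitted per window or per update, since an unbounded stream never reaches an "end" at which a final list can be returned.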
First Round
Q1 Delete duplicates from a table.
Q2 Find duplicate rows in the table.
Q3 Repartition vs Coalesce
Q4 How are stages created in Spark?
Q5 Word count program with Spark
Q6 Fibonacci sequence using Python
Q7 What are generators?
Q8 Spark architecture
Q9 Questions related to project
Techno-managerial Round
Q1 Discussion about the project and experience.
Q2 Query to create a table and partition the table in Hive?
Q3 Directory structure for a partitioned table
Q4 If you add a new directory with the correct schema to HDFS, will the data be shown in HQL?
Q5 Find the max temperature for a given date range.
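The Fibonacci and generator questions from the first round can be answered together. The sketch below is one possible answer, not the one the interviewer expected: a generator function produces values lazily with `yield`, so an infinite sequence can be defined and then truncated by the caller. The names `fibonacci` and `first_ten` are illustrative.

```python
from itertools import islice


def fibonacci():
    """Yield Fibonacci numbers indefinitely, starting from 0."""
    a, b = 0, 1
    while True:
        yield a          # execution pauses here until the next value is requested
        a, b = b, a + b


# Take only the first ten values from the infinite generator.
first_ten = list(islice(fibonacci(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

Because values are computed on demand, memory use stays constant regardless of how many terms the caller consumes.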
What were the complexities in your previous project?
Find middle element of linked list in one iteration.
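One common approach is the slow/fast pointer technique: advance one pointer a single node at a time and another two nodes at a time, so the slow pointer sits at the middle when the fast one reaches the end. The `Node` class and `middle` function below are a minimal sketch written for this example; for even-length lists this version returns the second of the two middle nodes.

```python
class Node:
    """Singly linked list node (illustrative helper)."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def middle(head):
    """Return the middle node in a single pass, or None for an empty list."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next         # moves one step
        fast = fast.next.next    # moves two steps
    return slow


# Build 1 -> 2 -> 3 -> 4 -> 5 and find its middle.
head = Node(1, Node(2, Node(3, Node(4, Node(5)))))
print(middle(head).value)  # 3
```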
Scenario Based Questions for both Coding and system design