Question 1) Given input.csv, we need to produce the output. The file and the desired output are given below.

input.csv:
username, mobile
user1,999999991:888888882
user3,777777771
user2,777777234:823232351
user5,734452343:943433434:834323434
user1,999999991:9994433777

output:
user1:3
user2:2
user3:1

Question 2) How can we read a CSV file into a DataFrame?
Question 3) Option to modify the encoding while reading a file in Scala
Question 4) Option to modify the timestamp while reading a file
Question 5) How to introduce separators like "," while reading a file
Question 6) How to infer the schema
===============================
Question 7) We have the 2 tables below; we need to find the users who visited a bank but didn't make any transactions.

-- Visits table:
-- +---------+------------+
-- | user_id | visit_date |
-- +---------+------------+
-- | 1       | 2020-01-01 |
-- | 2       | 2020-01-02 |
-- | 12      | 2020-01-01 |
-- | 19      | 2020-01-03 |
-- | 1       | 2020-01-02 |
-- | 2       | 2020-01-03 |
-- | 1       | 2020-01-04 |
-- | 7       | 2020-01-11 |
-- | 9       | 2020-01-25 |
-- | 8       | 2020-01-28 |
-- +---------+------------+

-- Transactions table:
-- +---------+------------------+--------+
-- | user_id | transaction_date | amount |
-- +---------+------------------+--------+
-- | 1       | 2020-01-02       | 120    |
-- | 2       | 2020-01-03       | 22     |
-- | 7       | 2020-01-11       | 232    |
-- | 1       | 2020-01-04       | 7      |
-- | 9       | 2020-01-25       | 33     |
-- | 9       | 2020-01-25       | 66     |
-- | 8       | 2020-01-28       | 1      |
-- | 9       | 2020-01-25       | 99     |
-- +---------+------------------+--------+
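One possible way to approach Question 7 is an anti-join: keep the visitors that have no matching row in the transactions table. The sketch below loads the two tables into an in-memory SQLite database through Python's standard sqlite3 module. It assumes the question is asked at the user level (a user counts as having transacted if they appear anywhere in Transactions); the table names visits and transactions and the variable names are chosen here for illustration only.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE visits (user_id INTEGER, visit_date TEXT);
CREATE TABLE transactions (user_id INTEGER, transaction_date TEXT, amount INTEGER);

INSERT INTO visits VALUES
  (1, '2020-01-01'), (2, '2020-01-02'), (12, '2020-01-01'), (19, '2020-01-03'),
  (1, '2020-01-02'), (2, '2020-01-03'), (1, '2020-01-04'), (7, '2020-01-11'),
  (9, '2020-01-25'), (8, '2020-01-28');

INSERT INTO transactions VALUES
  (1, '2020-01-02', 120), (2, '2020-01-03', 22), (7, '2020-01-11', 232),
  (1, '2020-01-04', 7), (9, '2020-01-25', 33), (9, '2020-01-25', 66),
  (8, '2020-01-28', 1), (9, '2020-01-25', 99);
""")

# Anti-join: a LEFT JOIN leaves t.user_id NULL for visitors with no transaction.
# Assumption: matching is on user_id only, not on the visit date.
rows = conn.execute("""
    SELECT DISTINCT v.user_id
    FROM visits v
    LEFT JOIN transactions t ON t.user_id = v.user_id
    WHERE t.user_id IS NULL
    ORDER BY v.user_id
""").fetchall()

users_without_transactions = [row[0] for row in rows]
print(users_without_transactions)  # [12, 19]
```

If the interviewer instead means visits with no transaction on the same day, the join condition would also need to compare visit_date with transaction_date.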
Data Engineer Interview Questions
21,066 data engineer interview questions shared by candidates
Explain the join and union and describe the differences between them.
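As a rough illustration of the difference, a join combines columns from two tables on a matching key, while a union stacks rows from two queries with the same column layout. The sketch below uses Python's standard sqlite3 module; the tables a and b and their contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER, name TEXT);
CREATE TABLE b (id INTEGER, name TEXT);
INSERT INTO a VALUES (1, 'x'), (2, 'y');
INSERT INTO b VALUES (2, 'y'), (3, 'z');
""")

# JOIN: widens the result, pairing rows whose keys match.
joined = conn.execute(
    "SELECT a.id, a.name, b.name FROM a JOIN b ON a.id = b.id ORDER BY a.id"
).fetchall()

# UNION: lengthens the result and removes duplicate rows.
union = conn.execute(
    "SELECT id, name FROM a UNION SELECT id, name FROM b ORDER BY id"
).fetchall()

# UNION ALL: lengthens the result and keeps duplicates.
union_all = conn.execute(
    "SELECT id, name FROM a UNION ALL SELECT id, name FROM b ORDER BY id"
).fetchall()

print(joined)     # [(2, 'y', 'y')]
print(union)      # [(1, 'x'), (2, 'y'), (3, 'z')]
print(union_all)  # [(1, 'x'), (2, 'y'), (2, 'y'), (3, 'z')]
```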
Finding Top N within nested Categories
How do you see your future?
In SQL, is there a difference between "sum(colA)+sum(colB)" and "sum(colA+colB)"? If so, what is it?
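One common answer is that the two expressions differ when either column contains NULL: SUM skips NULL values within a column, but colA+colB evaluates to NULL for any row where one operand is NULL, so that whole row drops out of sum(colA+colB). A small sketch with Python's standard sqlite3 module, using a made-up table t:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (colA INTEGER, colB INTEGER);
INSERT INTO t VALUES (1, NULL), (2, 3);
""")

separate, combined = conn.execute(
    "SELECT SUM(colA) + SUM(colB), SUM(colA + colB) FROM t"
).fetchone()

# SUM(colA) = 3 and SUM(colB) = 3, so the first expression is 6.
# The row (1, NULL) gives colA + colB = NULL and is skipped, leaving 2 + 3 = 5.
print(separate, combined)  # 6 5
```

When neither column contains NULL, the two expressions return the same total.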
The interview was based on my previous experience, with technical questions on Unix, IBM DataStage, DB2, Sybase and Oracle.
Print out a grade-school multiplication table up to 12x12 (Should be aligned)
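A minimal sketch of one way to do this in Python: right-align every product in a fixed-width column, sized from the largest value (n * n) so the columns line up. The function name multiplication_table is chosen here for illustration.

```python
def multiplication_table(n: int = 12) -> str:
    """Return an n x n multiplication table with right-aligned columns."""
    # Width of the largest product plus one space of separation.
    width = len(str(n * n)) + 1
    rows = []
    for i in range(1, n + 1):
        rows.append("".join(f"{i * j:>{width}}" for j in range(1, n + 1)))
    return "\n".join(rows)


print(multiplication_table(12))
```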
Regex, Binary
Prime Factorization... They expected me to remember the Sieve of Eratosthenes off the top of my head... That has absolutely nothing to do with the job role they described and I highly doubt they use it in their code base.
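For reference, a sketch of how the Sieve of Eratosthenes could be combined with prime factorization: sieve the primes up to the square root of n, divide them out, and treat any remainder greater than 1 as a final prime factor. The function names are illustrative, and this is only one of several reasonable approaches (plain trial division also works).

```python
import math


def sieve(limit: int) -> list[int]:
    """Return all primes <= limit using the Sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross out multiples of p, starting at p * p.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n in ascending order, with repetition."""
    if n < 2:
        return []
    factors = []
    for p in sieve(math.isqrt(n)):
        while n % p == 0:
            factors.append(p)
            n //= p
        if n == 1:
            break
    if n > 1:
        # Whatever remains has no prime factor <= sqrt of the original n, so it is prime.
        factors.append(n)
    return factors


print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
```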
What have you done so far as a Data Engineer? How many years of experience do you have? How much data have you processed? Do you have experience with AWS tools and technologies like DynamoDB? How can you manage MongoDB in production? Why do you want to leave your current job and country?