I applied in person. The process took 4 weeks. I interviewed at Freelancer (Kathmandu) in May 2023.
Interview
The interview process comprises three rounds: an initial HR call to discuss background and fit, followed by a coding round to assess technical skills, and concluding with a machine learning round to evaluate specialized knowledge and problem-solving abilities.
Interview questions [1]
Question 1
What is Random Forest? Describe its working mechanism.
They check the basics and other aspects of data science, as well as how you applied them in a real-world scenario: how the work solved a problem and added value to the entity in question.
I applied online. The process took 3 weeks. I interviewed at Freelancer (Sydney) in Apr 2019.
Interview
Two telephone interviews, a coding assignment, and a half-day in-person interview that included a pen-and-paper exam. After all of this, and after I took a day of annual leave for the in-person interview, they said they would call me back only if I was successful, and that if I didn't hear from them it meant I didn't get the job. Given the amount of time I spent on the interview process, I find this type of behaviour unacceptable.
Interview questions [2]
Question 1
Here is the coding assignment that I received:
## Scenario
Your friend owns a nightclub, and the nightclub is suffering an epidemic of stolen phones. At least one thief has been frequenting her club and stealing her visitors' phones. Her club has a licence scanner at its entrance, that records the name and date-of-birth of everyone who enters the club - so she should have the personal details of the thief or thieves; it's just mixed in with the details of her honest customers. She heard you call yourself a "data scientist", so has asked you to come up with a ranked list of up to 20 suspects to give to the police.
She's given you:
- `visitor_log.csv` - details of who visited the club and on what day (those visiting 2AM Tuesday are counted as visiting on Monday).
- `theft_log.csv` - a list of days on which thefts were reported to occur (again, thefts after midnight are counted as the previous day - we're being nice to you).
She wants from you:
- A list of ID details for the 20 most suspicious patrons, ranked from most-suspicious to least-suspicious.
- If you think there are fewer than 20 thieves, a list of ID details for everyone that you think is a thief.
## Metadata
We'd like you to spend up to a few hours on this problem. We don't just want to know your final answer and its source code; after you've finished, we'd like to discuss your journey - including any blind alleys you went down, or approaches you thought would be good but didn't know how to implement.
Here is as much of the pen-and-paper test as I remember:
<2 questions on Linux>
Maths and stats
1) 200 people responded to a mail-out sent to 1000 customers. They want to send it to another 500. What is the probability that 100 respond?
2) Joint distribution with p(x) > p(y). What is the probability that x > y?
3) Why does milk powder come in a cylinder and not in a box?
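A minimal sketch for question 1 above, under assumptions that are not part of the original question: each of the 500 new customers responds independently, and the response rate is the 20% observed in the first mail-out. The question does not say whether "100 respond" means exactly 100 or at least 100, so the sketch computes both with a binomial model.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Assumption: the response rate of the first mail-out (200 / 1000) carries over.
p_hat = 200 / 1000
n_new = 500

p_exactly_100 = binomial_pmf(100, n_new, p_hat)
p_at_least_100 = sum(binomial_pmf(k, n_new, p_hat) for k in range(100, n_new + 1))

print(f"P(exactly 100)  = {p_exactly_100:.4f}")
print(f"P(at least 100) = {p_at_least_100:.4f}")
```

Treating the 20% rate as known ignores the uncertainty in estimating it from only 1000 customers; a fuller answer might mention that.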
Experiment Design
Create two landing pages for a new competition to increase the number of subscribers and want to see which one is successful.
1) How would you test this?
2) How do you work out which landing page is more successful?
3) What’s the point? The competition is already over.
4) How could you improve subscriptions?
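One common way to answer questions 1 and 2 is to randomly assign visitors to the two pages and compare subscription rates with a two-proportion z-test. The sketch below is only an illustration: the visitor and subscriber counts are hypothetical, and the function name `two_proportion_z_test` is mine, not something from the test.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for H0: both pages convert equally."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis of no difference.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Hypothetical counts: subscribers and visitors for landing pages A and B.
z, p_value = two_proportion_z_test(conv_a=120, n_a=2400, conv_b=156, n_b=2400)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference in subscription rates is unlikely to be chance alone. The sample size and significance threshold should be fixed before the test starts.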
Programming
(i) Check to see if a string is a palindrome. How would you do unit testing?
(ii) An array A consists of the numbers 1 to n, but is missing a single value. Identify which one, using O(n) time and only one additional variable.
(iii) Using a random number generator, generate a distribution where one value is weighted more heavily than others (e.g., if the possible outputs are [1,2,3], 1 is output 80% of the time and 2 and 3 are output 10% of the time each).
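The three programming questions above could be sketched as follows. This is one possible set of answers, not the interviewers' expected solution. It assumes a palindrome check on exact characters (case and spaces count), and all function names are mine.

```python
import random
import unittest


def is_palindrome(s: str) -> bool:
    """True if s reads the same forwards and backwards, character for character."""
    return s == s[::-1]


def find_missing(a: list[int]) -> int:
    """a holds the numbers 1..n with exactly one value missing."""
    n = len(a) + 1
    # Single accumulator: start from the sum of 1..n and subtract what is present.
    missing = n * (n + 1) // 2
    for value in a:
        missing -= value
    return missing


def weighted_draw(rng: random.Random) -> int:
    """Return 1 with probability 0.8, and 2 or 3 with probability 0.1 each."""
    u = rng.random()  # uniform on [0, 1)
    if u < 0.8:
        return 1
    if u < 0.9:
        return 2
    return 3


class PalindromeTests(unittest.TestCase):
    def test_palindromes(self):
        for s in ["", "a", "abba", "racecar"]:
            self.assertTrue(is_palindrome(s))

    def test_non_palindromes(self):
        for s in ["ab", "abca", "Abba"]:
            self.assertFalse(is_palindrome(s))
```

The unit tests cover the empty string, a single character, even and odd lengths, and a case-sensitivity check. They can be run with `python -m unittest` once the code is saved in a module.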
Databases
Generate a database flowchart for a social network. Include users with username, picture, password, their messages, their friends, and any groups they’ve signed up to (with only one admin).
1) Write a query to:
(i) Extract a message list between two people.
(ii) Extract friends of friends of a user.
2) A) How is a left join different from an inner join?
B) How is where different from having?
3) A) Given a database of financial transactions (transaction ID, user ID, amount, date), how do you identify the balance of each user?
B) How do you find the mean of the user balances?
C) And their standard deviation?
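For question 3, a minimal sketch using Python's built-in `sqlite3` module. The table name `transactions`, the column names, and the sample rows are hypothetical. It also assumes amounts are signed (deposits positive, withdrawals negative), so a balance is a plain sum. SQLite has no built-in standard deviation aggregate, so the population standard deviation is derived from the mean of the squares.

```python
import sqlite3
from math import sqrt

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "transaction_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL, date TEXT)"
)
# Hypothetical sample data.
conn.executemany(
    "INSERT INTO transactions (user_id, amount, date) VALUES (?, ?, ?)",
    [
        (1, 100.0, "2019-04-01"),
        (1, -30.0, "2019-04-02"),
        (2, 50.0, "2019-04-01"),
        (3, 20.0, "2019-04-03"),
        (3, 10.0, "2019-04-04"),
    ],
)

# A) Balance of each user.
balances = conn.execute(
    "SELECT user_id, SUM(amount) AS balance FROM transactions "
    "GROUP BY user_id ORDER BY user_id"
).fetchall()

# B) and C) Mean and population standard deviation of the user balances.
mean_balance, mean_square = conn.execute(
    "SELECT AVG(balance), AVG(balance * balance) FROM ("
    "  SELECT SUM(amount) AS balance FROM transactions GROUP BY user_id"
    ")"
).fetchone()
# Guard against a tiny negative variance caused by floating-point rounding.
std_balance = sqrt(max(mean_square - mean_balance**2, 0.0))

print(balances, mean_balance, std_balance)
```

Databases that provide `STDDEV_POP` or `STDDEV_SAMP` can compute part C directly in SQL.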
Machine Learning
What is a confusion matrix? Give an example.
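As an illustration of the kind of example this question asks for, here is a small sketch that tallies a binary confusion matrix. The labels are made up, with 1 standing for the positive class.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count each (actual, predicted) combination for binary 0/1 labels."""
    counts = Counter(zip(actual, predicted))
    return {
        "TP": counts[(1, 1)],  # positive, predicted positive
        "FN": counts[(1, 0)],  # positive, predicted negative
        "FP": counts[(0, 1)],  # negative, predicted positive
        "TN": counts[(0, 0)],  # negative, predicted negative
    }


# Hypothetical labels.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]

cm = confusion_matrix(actual, predicted)
precision = cm["TP"] / (cm["TP"] + cm["FP"])
recall = cm["TP"] / (cm["TP"] + cm["FN"])
print(cm, precision, recall)
```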
Build an algorithm to determine spam based on the description somebody has written on the site.
(i) How would you clean the data?
(ii) What other data would be useful to determine this?
(iii) What features would you use?
(iv) What algorithm would you use?
(v) How would you evaluate the algorithm?
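One possible baseline for the spam question is a bag-of-words Naive Bayes classifier with Laplace smoothing. The sketch below is written in plain Python. The training texts, the `spam`/`ham` labels, and the class name `NaiveBayes` are hypothetical, and a real answer would also cover the cleaning, extra features, and evaluation asked about in (i) to (v).

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Basic cleaning: lowercase and keep alphanumeric tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text: str) -> str:
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            # Log prior plus smoothed log likelihood of each known word.
            score = math.log(self.doc_counts[c] / total_docs)
            for word in tokenize(text):
                if word in self.vocab:
                    count = self.word_counts[c][word]
                    score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[c] = score
        return max(scores, key=scores.get)


# Hypothetical labelled descriptions.
train_texts = [
    "cheap followers buy now click here",
    "earn money fast click this link",
    "experienced python developer for data projects",
    "graphic designer with five years of experience",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("click here to buy cheap followers"))
print(model.predict("python developer with data experience"))
```

To evaluate it, hold out labelled descriptions and report precision and recall for the spam class rather than accuracy alone, since spam is usually the minority class.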