Amazon Interview Question for Data Engineers


Country: India
Interview Type: In-Person




Comment hidden because of low score. Click to expand.
0
of 0 vote

Couple of doubts
1. Is date range non overlapping and sorted?
2. Is visitor/date file has record sorted by date?

- Progu November 02, 2018 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

Let me do it using mapreduce instead of spark, I don't know much of spark. I would assume the concept is similar though.

file1: <startDate endDate>
1 5
8 20
file2: <date visitor>
2 5
3 8
10 120

So answer should be 8-20 since in that date range we have max visitor.

Assumption : since file1 just has ranges it should be a small file and can be loaded into the distributed cache of hadoop.

Now execute map code for file2, the code should do following :
1) read file2 and for each date , see which range it belongs to from the distributed cache and increment the counter for that time range.
eg
for 2 increment counter for 1_5
for 3 increment counter for 1_5
for 10 increment counter for 8_20
3) output <range , counter> as map output
4) In reduce add all the counters for every range.

Also - we need to add total order sorting so that overall output of all reducers are sorted.

- kabs November 24, 2018 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 vote

(spark sql):

%sql

drop table dateRange
create table dateRange(startdate int, enddate int)
insert dateRange values (1,5),(8,20)

drop table dateVisitor
create table dateVisitor(date int , visitors int)
insert dateVisitor values (2,5),(3,8),(10,120)

select * from daterange;

select * from datevisitor


select top 1 sum(visitors) as totalvisitors, startdate,enddate from daterange d join datevisitor v on v.date between d.startdate and d.enddate
group by  startdate,enddate order by totalvisitors desc

select  distinct top 1 startdate,enddate,sum(visitors) over (partition by startdate,enddate ) as totalvisitors from daterange d join datevisitor v on v.date between d.startdate and d.enddate
order by 3 desc

- Vijay Panchal July 01, 2019 | Flag Reply
Comment hidden because of low score. Click to expand.
0
of 0 votes

Hi Vijay,

Please, I would like you to explain (insert dateRange values (1,5),(8,20)) and (insert dateVisitor values (2,5),(3,8),(10,120))

Also, are you on linkedIn?

I hope to hear from you soon

- Aaron April 24, 2020 | Flag
Comment hidden because of low score. Click to expand.
0
of 0 vote

drop table dateRange
create table dateRange(startdate int, enddate int)
insert dateRange values (1,5),(8,20)

drop table dateVisitor
create table dateVisitor(date int , visitors int)
insert dateVisitor values (2,5),(3,8),(10,120)

select * from daterange;

select * from datevisitor


select top 1 sum(visitors) as totalvisitors, startdate,enddate from daterange d join datevisitor v on v.date between d.startdate and d.enddate
group by startdate,enddate order by totalvisitors desc

select distinct top 1 startdate,enddate,sum(visitors) over (partition by startdate,enddate ) as totalvisitors from daterange d join datevisitor v on v.date between d.startdate and d.enddate
order by 3 desc

- vijay panchal July 01, 2019 | Flag Reply


Add a Comment
Name:

Writing Code? Surround your code with {{{ and }}} to preserve whitespace.

Books

is a comprehensive book on getting a job at a top tech company, while focuses on dev interviews and does this for PMs.

Learn More

Videos

CareerCup's interview videos give you a real-life look at technical interviews. In these unscripted videos, watch how other candidates handle tough questions and how the interviewer thinks about their performance.

Learn More

Resume Review

Most engineers make critical mistakes on their resumes -- we can fix your resume with our custom resume review service. And, we use fellow engineers as our resume reviewers, so you can be sure that we "get" what you're saying.

Learn More

Mock Interviews

Our Mock Interviews will be conducted "in character" just like a real interview, and can focus on whatever topics you want. All our interviewers have worked for Microsoft, Google or Amazon, you know you'll get a true-to-life experience.

Learn More