2020-11-26
What It Takes to Be a Data Engineer: Working with Mercari US’s Numbers in a Diverse Team Across Borders #WorkWithMercari
When you look at job listings, do you ever feel like you want more information than just the requirements listed in the description?
In the #WorkWithMercari series, we interview teams who are currently hiring, and dig deep to find out just what kind of person they’re looking for!
Today, we’re introducing Mercari US, one of the three pillars of the Mercari Group. The role: a data engineer in US@Tokyo, which helps with developing Mercari US from the Tokyo office.
Our Mercari US office is located in Palo Alto. However, our data engineers collaborate on development work not only from Palo Alto, but also from Boston and Japan.
In this article, we sit down with current Mercari US data engineers @ryota, @hatone, and @sudarsan as they share their thoughts about what kind of person they want to work with. @kumon, a manager in US@Tokyo, moderates.
※Masks were removed only for the photo shoot.
Featured in this article
- Ryota Katoh (@ryota): Joined Origami, Inc. in 2015, where he launched and became manager of the data engineering team. The company became part of the Mercari Group in March 2020. He then transferred to US@Tokyo in June 2020 and has been working there since.
- Takako Ohshima (@hatone): Joined Mercari US in February 2017. After working in the CXI Team, which uses technology to solve CS (customer support) issues such as decreasing fraudulent listings and optimizing user communication to improve CX, she now works in the Machine Learning and Data Engineering Team.
- Sudarsanan Janakiraman (@sudarsan): Has a wide range of experience constructing distributed systems and data pipelines for everything from startups to major companies. He started working at Mercari US in the Palo Alto office in 2019, and has been in charge of developing knowledge graph systems and event streaming systems. He enjoys hiking, cycling, and baking in his spare time.
- Takuma Yamaguchi (@kumon): Has been engaged in the study of pattern recognition and machine learning since he was a student, and has a doctorate in engineering. He has also conducted research in computer vision and operations research, as well as developing and operating data analysis systems. He joined Mercari in 2016. He is currently the engineering manager for the Data Engineering Team and AI/Machine Learning Team.
What we look for in a data engineer for Mercari US
@kumon: Our US@Tokyo team is hiring a new data engineer, so I’d like to talk to our current data engineers in Palo Alto and Japan and get their views. @hatone actually works in Palo Alto, but I asked her to join me here today since she’s currently in Japan.
Let’s get started then, and go over what we do and the skill set we’re looking for!
<What we do>
・ Design, build and maintain ETL pipelines for data analysis and production use
・ Operate and optimize our data analysis system
・ Combine data collected from a variety of sources to accelerate machine learning efforts
・ Collaborate with teams of data analysts, product managers and engineers in Japan and the US to accelerate data-driven business growth
・ Increase productivity organization-wide by standardizing data engineering tools and processes
<Ideal skill set>
・ Over 5 years of experience in software engineering, and at least 3 years in developing applications in Python, PHP, or Golang
・ Experience working on end-to-end development of backend systems
・ Advanced knowledge of databases, real-time and batch data pipelines, SQL, and data analysis
・ Fundamental knowledge and troubleshooting skills in security, Linux, logging, and system operations
・ Good communication and interpersonal skills, with the ability to collaborate across teams
Taking responsibility for all the data and thinking about how to lead
@kumon: We currently have four data engineers working on Mercari US, scattered throughout Palo Alto, Japan, and Boston. While we work from different places, we deal with all the data that concerns Mercari US. What role do you think someone in this position needs to play?
@ryota: Our mission is to create data pipelines and connect the data to the app. I think we need a data engineer who can think about things like how to improve our services with the data we have and how to make our data pipelines easier for people on other teams to use.
@ryota
@sudarsan: It’s important for people who are unfamiliar with data pipelines to be able to use them easily. The data we work with doesn’t just stay within the development team; it also goes out to people in accounting or IR. Teams tend to perceive data differently, and these differences can cause a lot of confusion and miscommunication, which ends up costing more on both ends. Moreover, we correspond a lot with Mercari JP as we have a lot of similarities in accounting, but we struggle with the language barrier since not all of their members understand English. It’s important to be able to overcome those barriers and see eye-to-eye.
@hatone: I think the most crucial part of being a data engineer at Mercari US is taking responsibility for all the data and thinking about how to lead. Because we deal with a lot of important information, it requires skills to both apply the data to our services and manage it appropriately at the same time.
@kumon: That’s true. We have access to all of the data from Mercari US, so we have to be able to understand what it means and be very conscious about quality and security.
@sudarsan: As far as engineering skills go, the role requires an understanding of algorithms and data structures, and experience using distributed systems. There aren’t many people who have experience with distributed systems in Mercari US, so that’s an area we’d like to strengthen.
@sudarsan
Advantages of blending different engineering cultures
@kumon: We are a borderless group. How is it working with people who speak different languages and have different cultures?
@sudarsan: If you look at it geographically, we not only have language barriers but also work in different time zones. We hold meetings when it’s early in the morning for some, or in the middle of the night for others. Of course, we could try to schedule multiple meetings so everyone can join at a time that works best for them, but then you lose efficiency. On the other hand, it’s really interesting to work in a team blending together different engineering cultures from different countries!
@hatone: It may be because of the way employment works in the US, but the development process in Mercari US is super speedy and more results-oriented. The process in Mercari JP is well-structured; they manage their projects well and strive for a better development experience. Recently Mercari US has started implementing this development culture too, and I think we’re starting to see that it’s fitting together well.
@hatone
@ryota: I feel the fusion at US@Tokyo in the Japan office, too. When you’re working in Japan on a service used overseas, it’s hard to feel like you’re really there where the service is being used. But at US@Tokyo, everything we do is connected to the US. Yes, we do face problems here and there, but we really feel that everything we do helps develop our overseas services. I think this is something you can only experience at US@Tokyo.
@kumon: @ryota, you joined the company online in March 2020. Did you find it difficult to get a grasp of the team atmosphere in the beginning?
@ryota: Yes, I did (laughs). I found it interesting that, compared to Mercari JP, Mercari US has a lot of users who aren’t very familiar with our services. Many Mercari JP users are accustomed to using Mercari in a certain way, so they don’t really use new features when they come out. But most of our US users will use whatever we recommend. You can see how different they are when comparing the numbers.
On the other hand, since we’re working on a service meant for overseas users while being in Japan, we can’t see how they react with our own eyes. Data engineers need to consider what kind of data is needed and how it is used when developing. As far as data structures and formats go, we can figure it out from our development environment if it’s something contained entirely within the Mercari app, but things become more complicated when various third parties (such as shipping or payment vendors) are involved. The physical distance in these kinds of situations does make working from Japan difficult.
Mercari US strives for data-driven business growth
@kumon: What kind of engineer do you think would be a good fit (or not) for Mercari US?
@kumon
@ryota: In comparison with well-established companies, we’re closer to a fresh startup where individuals have a lot of discretion. So, I wouldn’t recommend the position to anyone who is highly specialized in a specific field but unwilling to do anything else.
To be a little more specific, they will need to use their expertise in areas such as distributed processing and data science, but also work with the server side if necessary. We’re not exactly looking for a full-stack engineer, but it would be nice to have someone who is willing to do everything they can to help improve our services.
@sudarsan: Luckily for us, Mercari US has been successful with unique projects such as simplifying shipping procedures and creating additional features utilizing AI, which helped us achieve strong GMV growth. However, this is all still on a small scale, and we are still in the startup phase. We are still a lot smaller than GAFA in data size as well. That’s why we want to set up a scalable data analysis platform and automate operations according to the data while we can. We want to do things like enrich our fraud detection and content moderation features and visualize user behavior in real time to maintain a sound marketplace.
@ryota: I agree! We continually work for data-driven growth of the business. Until now, we have been rapidly developing whatever’s been required for our services. Now that we’ve established a wide range of features, it would be amazing to use our data to determine what is necessary to improve our services or create new features.
@hatone: Until now, it’s just been a cycle of starting a project, finishing it, and moving on to the next. But now that we have more people and established pipelines, it’s starting to feel like we (data engineers) are coming together as a team. This will hopefully enable us to use our strengths as individuals while working as a team. When I first joined Mercari US, I was actually analyzing data all by myself, so I’m really glad to be working alongside others!
@kumon: Mercari US is shifting from prioritizing speed and output to thoroughness in our work. Systems that were built with speed as the priority now face issues with maintenance and scalability, and we will take our time resolving these issues. We will continue to prioritize speed for some projects, and will continually be making improvements where necessary.
For me, it’s not exactly that I’m aiming to create a data foundation based on what the data engineers have built on their own and turn it into something that can still be used 10 years later. It’s more that I want to take what we’ve done, grow it to match the phase of business we’re in, and contribute to facilitating data-driven decision-making and automation.
If this sounds interesting to you, please consider applying to become a data engineer for Mercari US. We look forward to hearing from you!