We love telling the stories behind the daily challenges we face and how we solve them. But we also love hearing about the insights, experiences and lessons learned by prominent voices in the global community.
In this new series of 1-on-1 interviews, our very own Wix Engineers talk directly with some of the most inspiring minds in the tech industry - from software engineering to architecture, open source, and personal growth.
In this fourth installment in the series, Wix Senior Software Engineer Natan Silnitsky spoke with Gwen Shapira - former engineering lead at Confluent and co-author of the books "Kafka: The Definitive Guide" and "Hadoop Application Architectures" - about the challenges and rewards of event-driven design, about making Kafka clusters elastic and highly scalable, and about transforming Kafka to be truly cloud native.
When you’re done here, don’t forget to check out the rest of the series.
This in-depth conversation covers it all, it seems - from the challenges and rewards of event-driven architecture, to Kafka’s unique data streaming attributes, to making Kafka clusters more elastic and scalable, to the challenge of maintaining open source forks, to advice on how to encourage women and girls to pursue careers in engineering. Oh, and also why it is really important to “keep your living room clean”.
Gwen is a gifted speaker and blogger covering topics ranging from the inner workings of Kafka, to software engineering, to product management, and she has a way of thoroughly explaining complex issues and subjects with great clarity.
Natan: I thought we could start with Event Driven Architecture - what is it really, in the context of microservices? What does it mean?
Gwen: It actually means several different things depending on your approach. At the simplest level, if you're an architect, you can look at it as a collection of design patterns that fit well together. It's actually good to approach it as not just one architectural style. I’m a big fan of Scala, and the reason I like it so much is because it doesn’t force you into any “purist” approach.
Instead it gently recommends immutability and a functional style, but you can really mix and match to keep the code readable, performant, etc. And I feel the same way when you go up a level, to architecture. You shouldn't really be dogmatic and say: "Okay, I'm doing event-driven architecture and I use every single pattern, every single thing has to be done this way". This rarely ends well.
It is good to be familiar with those patterns and the programming style because they have a lot of benefits in many cases. It plays well with immutability - the whole idea is that events are written once and kept in order, and you can replay them as many times as you want.
But the fact that I just published a new event that the weather today is 42º Fahrenheit does not change the fact that yesterday it was 60º Fahrenheit. So you keep publishing new events recording new facts about the state of the world, which the services that subscribe to them can then act upon. They can be taken as commands, as new information, or used to hydrate their own datastore - that’s a very common pattern.
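To make the "immutable facts" idea concrete, here is a minimal Scala sketch (the event and field names are illustrative, not from any real system): each reading is appended to the log, never updated, and a subscriber hydrates its own view of the world by folding over the history.

```scala
// Minimal sketch: events are immutable facts, appended in order.
final case class TemperatureRecorded(day: String, fahrenheit: Int)

val log = List(
  TemperatureRecorded("2021-03-01", 60),
  TemperatureRecorded("2021-03-02", 42) // does not erase yesterday's fact
)

// "Hydrating" a local datastore: derive the latest reading per day
// by replaying the log from the beginning.
val view: Map[String, Int] =
  log.foldLeft(Map.empty[String, Int]) { (state, event) =>
    state + (event.day -> event.fahrenheit)
  }
```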
On the other hand, I can also see that many microservices might be too simple to do all of the above - and all you want is, say, an occasional ping, a request and a response back… So don’t be dogmatic, but be familiar with the logic so you can make good calls and decide which is appropriate when.
Natan: And when it's appropriate, do you see any challenges in working with this architecture?
Gwen: I think for a lot of engineers it's a mental paradigm shift. I don't see too big of a challenge for engineers who are experienced with those design patterns and know how to apply them. But for most people this is not something they learned at school. So they will have to learn it - sometimes via video classes; hopefully there’s something related already happening in their company and they can learn from a good mentor and by pairing, but a lot of times that’s not the case. And so if you, say, read about a pattern in a book and try to apply the theory in a real-world scenario without a mentor, really trying to figure it out by yourself, there’s a lot that can go wrong.
The other part is that I do feel like a lot of our libraries and frameworks are still fairly primitive. Which is kind of embarrassing to admit - I mean, we are Confluent, we have Kafka… But Kafka is very low-level infrastructure at the end of the day. Meaning that for most of the things you actually want to do you will have to find a third-party library. And often you will not find something that’s exactly a good fit. Which means you will need to modify things and invent things, and then, obviously, it gets riskier if you're not experienced and are trying to invent a missing wheel.
There are some event-sourcing-specific databases out there. I haven't tried them - they may be good - but I also haven't seen very wide adoption that would make me certain that they're tested in production at large scale, very stable, survive network partitions, all these kinds of things. It can take a very long time and take a lot of users to productionalize an open source project - that's one of the biggest Kafka lessons by the way.
And then there are entire areas where our community as a whole just does not have great solutions, where, again, without a great mentor, you're in a difficult situation. Take versioning - I think it has always been very difficult. As long as you can evolve things and it's compatible, you can work around it, but if you actually need a new version of a type of event, I think you do not have great options, and that’s a bit sad.
Natan: I guess versioning is also a great challenge in a request-reply model.
Gwen: Yes, I agree, but in request-reply there's at least an accepted solution. You may or may not love it, but it's there. You do /v2 and kind of slowly convert clients from one to the other. Because there is no state, because you never have to deal with all the events, it's fairly straightforward.
The whole problem with event sourcing is that we want to maintain this state of events over a very long period of time, so you have to always kind of provide the way to do something with those v1 events. And I think that is incredibly difficult for a lot of projects - just imagine supporting an event type where the horizon is the next five, ten, even twenty years!
Natan: Well, I don't want to discourage our audience here. I mean, we have Avro and Protobuf that offer solutions.
Gwen: Yes, but I think what I'm missing is the equivalent of database migrations. Databases are stateful, and if you want to completely change the schema you plan a series of migrations that change the current state of the database, so you never have clients that can’t read some parts of the data because it's the wrong version. For event sourcing, that's basically a big operation. I've seen companies do it - literally copy and migrate a bunch of events - but again it's not an industry standard, you do it if you see the value.
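One common workaround - not an industry standard, as Gwen notes - is "upcasting": translating old event versions into the current one at read time, so downstream code only ever sees the latest shape. A hypothetical Scala sketch:

```scala
// Hypothetical event versions; the names and fields are illustrative.
sealed trait SignupEvent
final case class SignupV1(email: String) extends SignupEvent
final case class SignupV2(email: String, marketingOptIn: Boolean) extends SignupEvent

// Upcast at read time: v1 events get a safe default for the new field,
// so code replaying years of history only has to handle SignupV2.
def upcast(event: SignupEvent): SignupV2 = event match {
  case SignupV1(email) => SignupV2(email, marketingOptIn = false)
  case v2: SignupV2    => v2
}
```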
That's what I see as the biggest challenges - we’re missing those tried and true solutions where you can say "this is how everyone does things and this is very useful".
Natan: Let’s talk about the positives then - where does it really shine, do you think? Where can you say: "Oh, this fits like a glove!"?
Gwen: I use it all the time, Confluent uses it all the time! So I feel like it actually shines most of the time! And especially once you get experienced - it pays dividends for a long way down the road. You have a built-in audit: when a security officer asks us how many people generated a key of a certain type, for example, we have the information of who generated what and when. And we didn’t have to put in any extra effort, it was all just there, which is a lot of fun - we really like that!
If you mess something up, going back in history, re-playing, selectively re-playing... We do a lot of it. Especially when we have new features, during development it's very nice. You can copy a log of events from production to development, and basically iterate on it as long as you need to.
We do that a lot for metrics, since Confluent has a lot of automation based on metrics - if something happens in the metrics, we move stuff around, restart things, etc. We can basically stream events, metrics, from production to development and just run it in development as if it were production, which we like a lot.
And I nearly forgot to mention one more great thing - the code looks nice, the style of coding that’s reactive to events. After you get used to it, it's so natural, so nice to read!
Natan: Nice! Now, we talked in general about event-driven style. Let's talk specifically about Kafka. Kafka, of course, is not your traditional message queue. First, let's talk a little bit about the differences between Kafka and traditional message queues like RabbitMQ.
Gwen: Kafka, actually, wasn’t really built as a message queue. We use it as a message queue, but it was built as a data stream - to transport a huge number of events at a very large scale between the many different components of a large company, a lot of which report events at very high rates.
So, if you look back at LinkedIn's first use cases, a lot of it was around trying to get all the information from all those services into their search index. And when you start collecting data from thousands of microservices on tens of thousands of instances, you end up with gigabytes per second (and I think we currently do that on a fairly regular basis in a single Kafka cluster). That’s what it was written for, that’s what it was optimized for. So if you think you will have to deal with a lot of events at a very high rate and you need to store them for a long time, that's where Kafka really shines.
Beyond that, we discovered that it is also very useful when you deal with smaller data, with fewer events. I mean, when you build something that is so robust at a large scale, it is going to be very robust at a small scale. So we actually ended up using it for services that don't have that much data. Since if we already publish everything that happens in a service - every change, every new order, every new sign up and registration - and we already stream all of these to search and to our product analytics - why not also allow five other services to subscribe?
And then when we have, say, a new sign up - let's send them a welcome email, let's make sure that the billing is set up, etc. All those things kind of evolve naturally, and it's a pattern that repeats again and again. With every company - and at this point, after six, seven years with Confluent, that’s probably hundreds of customers - it’s always the same repeating pattern. There’s always someone who needs data that belongs to another service, to another domain - it lives in that domain's database - and they have trouble negotiating the right APIs.
They get the other domain to publish events (or sometimes just slurp it from a database using Debezium), and then they get their data. The first service uses it very successfully, and then all of a sudden everyone is saying: "OMG, we have this data, it's accessible, we can subscribe to it, we can do stuff when things are happening in another part of the business, do things we could never do before!”.
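Mechanically, each new subscriber is just another consumer group on an existing topic, which is why adding the fifth service is as cheap as adding the first. A sketch using the plain Java client from Scala (topic and group names are made up):

```scala
import java.util.Properties
import org.apache.kafka.clients.consumer.KafkaConsumer
import scala.jdk.CollectionConverters._

// Each service subscribes with its own group.id, so every service
// independently receives every event published to the topic.
def subscribe(groupId: String): KafkaConsumer[String, String] = {
  val props = new Properties()
  props.put("bootstrap.servers", "localhost:9092")
  props.put("group.id", groupId)
  props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
  props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
  val consumer = new KafkaConsumer[String, String](props)
  consumer.subscribe(List("signups").asJava)
  consumer
}

val welcomeEmails = subscribe("welcome-email-service") // sends the welcome email
val billing       = subscribe("billing-service")       // sets up billing
```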
Usually from the time that the first project is delivered, after about a year - year and a half, the entire company is addicted to getting subscribed to events and doing stuff with them.
And it's so good for the engineers, it gives them so much freedom to experiment. Because they don't need to convince another team to do something, like create an API - which is usually really hard, since everyone is busy, people need to prioritize, etc. Allowing a team to be productive independently, because they have access to everything they need, is like magic for engineering teams!
Natan: Absolutely! Let's talk a bit about the future of Kafka. I know it started off paired with Zookeeper and that there’s an ongoing effort now to remove this coupling. Can you talk a little bit about this journey and what it would mean to decouple Zookeeper completely from Kafka?
Gwen: I think a lot of it is kind of misunderstood. Zookeeper is fine, we don't have serious Zookeeper trouble - and please, don't go and rip Zookeeper out of everything you use it for. It is a very, very solid, stable, tried and true technology that serves us well.
What we really wanted is to make Kafka more event-driven. Think about how Kafka itself works - when you do leader election, for example. You have the controller: it writes state to Zookeeper, and then it starts sending messages to other brokers like "hey, you're now a leader, you're now a follower". Those messages sometimes can get lost in various ways - add to that all the usual problems of request-response, which we're very familiar with. Meanwhile Zookeeper has the entire state, and that state can be incredibly large if you have a lot of topics and a lot of brokers - that's a very large amount of metadata for Kafka. So if a new controller starts, before it can do anything at all, it has to load all of it. And the broker acting as controller often hosts partition leaders itself.
And now imagine it's going down. It has to now load the entire state before it can even start electing any leaders, so the entire leader election is paused for the duration of loading the state. And this is when we say "don't have more than 200,000 partitions in the Kafka cluster, this is the thing that we’re trying to save you from”, from having the controller take more than 30 seconds to come up because it takes too long to load the state.
Then it has to send the new state to everyone. If there's a lot of them, and they have to respond… all these things are very time consuming. And at some point the light bulb went on: we actually know a better pattern - writing all those things to a log of events. We used Raft, which gives us a synchronized, distributed log of events, and said, okay, the controllers will form a controller quorum and they will agree on the log of changes to the metadata. All the brokers can subscribe to it, read these events, and be up to date - and we get an auditable source of everything that has happened.
We know it's efficient, we know we can do it at scale, we know it is very low latency - basically it just seemed like a good way of making Kafka more architecturally cohesive…
I like those architectures where, if you understand a few central concepts, you can then understand everything that happens as a result. I think it definitely pushed us to take a step in that direction - we no longer need to force people to understand Zookeeper, which is quite different from Kafka. And as a bonus, you can now start Kafka on a single machine, with a single process.
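For flavor, this is roughly what that single-process setup looks like in KRaft mode - the same process plays both broker and controller. The config names below come from the open source KRaft work and may differ between Kafka versions, so treat this as a sketch rather than a reference:

```properties
# server.properties - one process acting as both broker and controller
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://localhost:9092,CONTROLLER://localhost:9093
controller.listener.names=CONTROLLER
```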
If you think about stuff like RabbitMQ, which enjoys wide adoption because it's very simple, we kind of wanted to give people a similar experience. We didn't like the fact that Kafka is referred to as "heavyweight", “expert only” and “really hard to manage”.
Natan: Can you at this point share some numbers about the new partition limits? You mentioned going above 200,000 - or is that still not guaranteed?
Gwen: I think we published a benchmark recently where we tested with something like eight million.
Natan: Wow, that's crazy!
Gwen: Yes, well, it was a prototype. One of my roles is leading Kafka performance, and one of the things I learned is that performance is really hard to gain. You need major architectural changes to get from 200,000 to several million partitions - while losing performance is rather easy! You can have one person write one log line in the wrong place and accidentally... that's it!
Natan: I can really relate to that because Wix is also madly driven by performance. For instance, we have practices in place to check whether performance takes a hit when someone introduces new code.
Gwen: Exactly, we do the same. We have JMH micro benchmarks that we run on every commit, we have nightly benchmarks… It's really hard, even with all the tools, to keep it high quality. Constant vigilance is the right approach to performance.
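For readers who haven't used it: JMH is the JVM's standard micro-benchmark harness, and a benchmark is just an annotated method. A minimal Scala sketch, runnable with the sbt-jmh plugin (the measured operation here is a placeholder):

```scala
import java.util.concurrent.TimeUnit
import org.openjdk.jmh.annotations._

@State(Scope.Thread)
@BenchmarkMode(Array(Mode.AverageTime))
@OutputTimeUnit(TimeUnit.NANOSECONDS)
class CopyBenchmark {
  var payload: Array[Byte] = Array.fill(1024)(42.toByte)

  // JMH calls this repeatedly and reports the average time per call;
  // tracking the number on every commit surfaces regressions early.
  @Benchmark
  def copyPayload(): Array[Byte] = payload.clone()
}
```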
Natan: There are situations where Kafka is a bit too low-level. What do you think about higher-level Kafka consumer SDKs? Wix has Greyhound, which offers features on top of consumers in a microservices setting. Does Confluent have plans in this direction? I think you talked about Parallel Consumer?
Gwen: First of all, I'm a huge fan of high-level clients and I think the world needs more of them. Kafka is very low-level, and there are so many patterns that are so common - why would everyone need to reinvent the wheel again and again? I feel like in many ways there should be a lot more of them, and then just a lot more discussion around them. I would love to see more community life around those things. Also, anything that will make people comfortable helps, because when you see something from, say, Wix, you may not know right away if it's stable, how good it is, how many bugs it has, what kind of testing was done…
Those are the concerns I hear people express each time I say they should use one of the high-level clients instead of writing their own. People get concerned, and quite rightly so.
Wix, in particular, might suffer in this regard since you make very advanced use of Scala. Often people look at the code and say something along the lines of "wow, it looks like magic, I'm not comfortable about my ability to figure out this code myself if something goes wrong in my environment”. Documenting, giving people maps, explaining how the code works, where to look for things, sharing examples, explaining how to troubleshoot if something does go wrong - that’s something I think people would really enjoy.
By the way, I really want to give kudos to you for open sourcing! So many companies solve these problems separately. I've been trying to convince companies to open source these high-level clients since 2018, I think. And most companies either say they have to go through too much "legal", if it's a bank or something like that, or that they are just not “proud” enough of their code to make it open source. Which is always very sad to me, because you have to look at your own code day in and day out. Don't make it nice for others, make it nice for yourself!
It's like saying you can’t invite guests over because your living room is messy - but you have to live in your living room!
Natan: Being open source definitely keeps you even more vigilant about code quality, so I highly recommend doing it. Plus when usage increases, you get bug reports from people, stuff like that - it's really helpful… I think the natural next subject for us to discuss here is Confluent Cloud. Kafka wasn't originally written for the cloud - why does that really matter? Where are the challenges in making Kafka cloud-native?
Gwen: There are certain types of things people expect from a good cloud service. And when I say good cloud service, I always imagine something like S3. People don't worry about how many machines their S3 buckets are on, they don't have to worry about how many other people are on the same machine. They get certain guarantees, they get a certain experience, the support is surprisingly good - you really don't have to worry about all the details.
That was definitely not true for Kafka a while back, and I think we are still not at the end of the road of where Kafka could be. One of the important things is that everyone always wants to tune brokers and really cares about how many of them exist. People just have to do a lot of capacity planning, and a lot of it is because Kafka used to be very inelastic. So we did a lot of work, both at Confluent and in the community, around tiered storage, which gives us really nice elasticity - a way to very easily and rather quickly scale a cluster up and then scale it back down.
And this is something that people kind of expect. It’s still not fully automated on Confluent Cloud, but it's very obvious that we will get there - and we already have the API, so you can automate it yourself. If you want to scale on a schedule, you can start a cron job; if you want to scale based on metrics, we have a metrics API, and it's not that hard to tie it into our shrink and expand APIs. Elasticity is one really important dimension here.
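As a sketch of what that glue might look like - to be clear, the endpoints, units and thresholds below are entirely hypothetical, not Confluent Cloud's actual API:

```scala
import scala.sys.process._

// Hypothetical endpoints, purely to illustrate the cron-job wiring:
// read a load metric, then call a shrink/expand endpoint when it
// crosses a threshold.
def ingressMBps(): Double =
  Process("curl -s https://metrics.example.com/clusters/c1/ingress").!!.trim.toDouble

def resizeCluster(brokers: Int): Unit =
  Process(s"curl -s -X PATCH https://api.example.com/clusters/c1?brokers=$brokers").!

val load = ingressMBps()
if (load > 80.0) resizeCluster(brokers = 8)      // expand when busy
else if (load < 20.0) resizeCluster(brokers = 4) // shrink when idle
```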
Playing nicely with load balancers is something that people rarely think about. The Kafka protocol is tied to brokers, and we cannot change the Kafka protocol - or rather it's crazy hard; it would be a five-year project if we ever picked it up, so we try really hard to avoid touching it. The protocol requires clients to discover which brokers have the leaders for their partitions and to talk to those brokers directly. This requires a lot of interesting routing work to happen at the load balancer layer. Kafka also assumes long-lived connections, so if you want to add more capacity to the load balancer you actually have to be fairly proactive about moving connections around. All those are kind of small details, but at the end of the day, if you run a cloud service, they are kind of important.
Multi-tenancy was obviously a very big one for us in terms of Cloud. Apache Kafka has the tools required for multi-tenancy - mostly its quota capabilities - but in Confluent Cloud we did a lot of automation with very smart algorithms to allow tenants to share clusters: as long as there are resources, they will split them intelligently, and once we get close to capacity, we can obviously grow automatically. So we have pretty good multi-tenant isolation inside Kafka. We are very proud of it, it's really unique. It's useful for Confluent Cloud, but also for a lot of businesses that have multiple logical tenants in them.
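The quota primitives Gwen refers to are part of the open source AdminClient. A sketch of capping one tenant's produce and fetch rates (the tenant name and limits are made up):

```scala
import java.util.Properties
import org.apache.kafka.clients.admin.Admin
import org.apache.kafka.common.quota.{ClientQuotaAlteration, ClientQuotaEntity}
import scala.jdk.CollectionConverters._

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
val admin = Admin.create(props)

// One quota entity per tenant, keyed here by the authenticated user.
val tenant = new ClientQuotaEntity(Map(ClientQuotaEntity.USER -> "tenant-a").asJava)
val ops = List(
  new ClientQuotaAlteration.Op("producer_byte_rate", 1048576.0), // 1 MB/s in
  new ClientQuotaAlteration.Op("consumer_byte_rate", 2097152.0)  // 2 MB/s out
).asJava

admin.alterClientQuotas(List(new ClientQuotaAlteration(tenant, ops)).asJava).all().get()
admin.close()
```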
Cloud is, for better or worse, kind of associated with REST APIs. So the other thing we did was to really increase the investment in the Confluent REST proxy and add things like admin capabilities. We're still working on it, but we're really trying to make it a first-class cloud citizen. It's the experience the world expects. I'm a big fan of the Kafka protocol, but we'll see.
Natan: Exciting times ahead! I was wondering if you could talk about the behind-the-scenes architecture of the elasticity you spoke about. Depending on how you implement it, it sounds like you have a pool of brokers… How does that work? You said the protocol itself was limiting, so it's really interesting to hear about that.
Gwen: The main limitation of the protocol is that in order to use a new broker, you need to have some partitions on it, and clients that produce to and consume from those partitions. In the past, if you had a very busy cluster, it was very difficult to move a partition from one place to another, because it could mean moving terabytes of data over a fairly busy network. Even just figuring out what to move was a rather challenging algorithmic problem - one that an incredibly large number of companies resorted to solving either by hand or with fairly random scripts. So we wanted to attack both problems. The part about making it easier to move things around we would do with tiered storage.
It's still a work in progress in Apache Kafka, but it's been fully in production for several years now in Confluent Cloud: we actually only store a tiny bit of the data for each topic on the local disk. Most data is tiered to S3, which means if I need to move something, for each topic I usually need to move on average, maybe, 50 MB. That's obviously a big improvement over gigabytes or terabytes. All those movements can happen really fast, and it really speeds up the time until new brokers are fully utilized.
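In the open source effort (KIP-405), that local-versus-remote split surfaces as topic-level configs. A sketch of what creating such a topic might look like - the config names come from the KIP and may vary by version, and the broker must have remote log storage enabled as well:

```scala
import java.util.Properties
import org.apache.kafka.clients.admin.{Admin, NewTopic}
import scala.jdk.CollectionConverters._

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
val admin = Admin.create(props)

// Keep roughly an hour of data on local disk; older segments live in
// object storage but stay readable through the same topic.
val orders = new NewTopic("orders", 12, 3.toShort).configs(Map(
  "remote.storage.enable" -> "true",
  "local.retention.ms"    -> "3600000",   // ~1 hour locally
  "retention.ms"          -> "604800000"  // one week in total
).asJava)

admin.createTopics(List(orders).asJava).all().get()
admin.close()
```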
A lot of people were kind of worried about the performance implications of tiering to S3. We found that it actually solved one of the biggest concerns people had with Kafka: people used to be constantly worried that if they read old data, it would mess up the caching and slow down reading and writing of new data. Luckily, if you read from tiered storage in S3, it's a network read - it does not go through the page cache. The initial bytes are higher latency, but after that it's higher throughput, so everything past the initial byte you get at normal read-from-disk speeds. And since it does not go through the page cache, everything that’s working from memory just keeps running, business as usual. We also found some optimizations around prefetching and some other things, so it ended up not just helping elasticity, but really helping performance in ways we did not initially expect. That was pretty cool.
And then there's the algorithm for moving stuff around. So now that it's easy to move partitions, you just want something that will automate it and will choose the right partitions to move over. In the community, the tool of choice is Cruise Control - recommended, a lot of companies use it. The main thing is that it's finicky. You need to tune it, essentially. 90% of the time it will not work out of the box. It works well, but the problem is that I have several thousand clusters to manage and I can’t hand-tune each and every one of them.
So we built algorithms that are slightly more general and slightly better at taking into account the current behavior of the cluster - and it works. We borrowed a lot from Cruise Control; I think if you run the Confluent Platform with self-balancing you may even see some very familiar things in the logs. But we really had to make a lot of the tuning self-learning, essentially, so we don't have to babysit every single cluster in our fleet.
Natan: I always wondered how the Confluent Platform and Confluent Cloud are structured. Is it that there's the open source Kafka code base and the Confluent code sits on top of it? Or is it more like a fork? How does that work?
Gwen: That's actually a very good question. We started out with plugins - we basically plugged extra stuff into Kafka. That took us quite a long way, and it was fairly easy to maintain. At some point we decided we actually wanted to modify things that are more core. They are still fairly separate - we built the entire balancing work as a plugin, for example - but the integration point is not as simple as “load this library into a well-defined interface”; it's more closely integrated with the controller. And we did this in several places, so it started to accumulate after a while…
I'm continuously thinking about this choice. The main benefit of being forced into the plugin model is that it forces you to define really nice interfaces. And I miss that. A lot of things that were kind of integrated together get me to say something along the lines of "can you clean it up?" or "can we make these boundaries better defined, can we have clearer separation of responsibility?", things like that. If you don't have this hygiene, it's too easy to create something that is not as clean as you'd like it to be.
But on the other hand, there are cases where putting the extra effort into very clear interfaces would not necessarily make sense, where things are just naturally well integrated. You actually want to make the modification for something that is very core, so it would have limited us if we insisted on doing this model for everything.
And now we have a price to pay. We still want to keep up with upstream, which means that every single change that happens upstream we have to merge and some upstream changes are incredibly substantial. So we have a lot of people working full time (not as a job description, but for a specific period of time) basically just bringing in the changes, merging them, figuring out how things are going to work together. It's not at all like an easy merge where you just click on the “Merge” button.
It requires our own testing, and really our own design work - how will this new topic ID, as an example, work with our isolation mechanisms for multi-tenancy? Someone has to design it. It's not just "I'm going to merge and it will magically work" at all… But I think we are doing the right thing for the community and for Confluent Cloud by working the way we do.
Natan: I read that when you started your Engineering career, you chose to become a DBA, and that’s a role that’s stereotypically associated mostly with men. Could you talk a bit about your personal story? How did you end up where you ended up?
Gwen: I guess nobody told me about the association with men (laughing). I started my career as a software engineer; I did my degree in Computer Science and Statistics. And I was happy that very early on my job was around data analytics - writing a lot of queries, creating very nice dashboards. And I got to know databases pretty well because, well, analytics depends on good database skills. I spent a large amount of time tuning the database - I was always passionate about performance, I spent a long time playing with indexes and different structures, thinking about optimizing things. Making my dashboards extra fast was very enjoyable to me.
And then, maybe two years into my career, our amazing, rockstar, world famous DBA left, and I was quite inspired by him, so I was thinking hey, I'll ask to replace him, essentially. And remember, I had two years of experience and he had like, I don't know, twenty, a world famous DBA. And I was like “can I get his room?”.
But the company was fairly supportive, and the funny thing was that when I asked friends and colleagues if they thought I was making a good career move, everyone told me it was a good job for a woman. I didn't realize that it was actually rare for women to do this. And throughout my journey, there were always really impressive women around. I remember - and I'm blanking on names because it was 20 years ago - that very early on, when I became a DBA, I started working closely with a superstar expert at Oracle Israel. She was a woman, and so impressive; I learned a ton from her. And then I moved to a different job with Pythian, which was a database consultancy, and the top SQL Server expert there was a woman. We very soon hired a world renowned Oracle expert who was also a woman. So it felt like there were always very, very impressive women around me. I think until now I never realized that there was a stereotype around it.
Natan: That's the nature of stereotypes, I guess. Do you sense any changes in the working environment since you started your career? What's the balance between men and women programmers at Confluent, and in management roles? How do you see that?
Gwen: Things have definitely improved a lot. When I started my career, it was still fairly rare to have women engineers. In Israel it was better than in the United States - partially, I think, because of the army: it sorted a lot of women who had, you know, mathematical inclinations into engineering roles, and for them it felt natural to continue. In the US… there were definitely so many meetings where I was the only woman in the room.
Confluent is good about it. I think we have around 30% women in engineering, something around that. But it skews junior for sure, because there are a lot more young women engineers than older ones - the pattern was incredibly skewed back then, and now it's a lot more even. And I'm really excited to see how fast people grow! I want to think that Confluent is a supportive place for women to grow in; we pay attention to whether we’re promoting men and women at similar rates, for instance. And if we notice that we don’t, then we ask ourselves where we are failing. There's never the assumption that women inherently have something that makes them any less worthy of promotions, so if we see very different rates, it's a cause for concern.
Managers with very skewed hiring rates - if you hire eight people and all of them are guys - it is going to show up in your performance review. That is not considered a successful manager; you have to work on it. The effort is there. And we also have really strong representation of women in the senior ranks. Rajini Sivaram is a principal engineer at Confluent and a co-author of my book - there are four co-authors and two of them are women. She is one of the top Kafka contributors, a PMC member, and one of the smartest, best engineers I've ever met.
My organization, the Cloud-Native Kafka org, has three managers, all of whom are women. I was actually super worried about hiring the third woman manager. I was like "oh my God, what will people think when all of the managers in my org are women?”
Natan: A man would have never thought like that…
Gwen: Exactly! That's exactly what my manager told me - look at all the other guys who only manage guys! Just hire the best person for the job, don't look at gender. So that's how I went with it and it definitely paid off. And then it's not just my org, we have women managers throughout Confluent, women in senior positions all the way until the Director roles under our VP of Engineering, we have great women there. Everyone has a good role model to learn from, which is fantastic.
Natan: That’s very inspiring to hear! Can you provide any tips for our readers on how to not only expose their daughters and nieces to programming, but also encourage them to pursue software engineering careers?
Gwen: First of all, you're right, you have to expose people. Parents and relatives recommend hobbies - so alongside suggesting playing the piano or playing basketball, suggest trying a programming class tailored to their specific age. That's the whole point of being a kid, right? Just try different things, figure out what you’re about. Definitely make sure it’s tried - whether through online games (there are a lot of programming-oriented games out there these days) or via classes.
Mostly, it is important to encourage them - people like different things, and that's OK, not everyone has to enjoy programming. But you really have to be careful not to discourage them, since that’s where things start - at a very early age.
When I was in high school, I tutored younger kids in Math - that's how I made some spending money. I still remember, like 30 years later, this very bright 8-year-old girl who told me she wished she was a boy, because then she'd be good at Math. But A, you are good at Math, and B, even if you were not… How? What? How did she get this idea?! So it's really important to make sure they don’t get these ideas.
There were other careers that I was discouraged from - when I was a kid I wanted to be an orthopedic surgeon. My close relatives were like "no, women cannot do orthopedic surgery - it's physical, it requires lifting very heavy things, you have to wake up in the middle of the night, how will you have kids and be an orthopedic surgeon?”. They did not know that engineers also have “on call”, so... You don’t have to encourage as much as you need to expose and avoid discouraging. They will make their own choice!
Natan: Thank you so much for this interview - I really enjoyed it, I think we covered a lot of interesting subjects and you provided great insight that I think everyone can benefit from.
Gwen: Thank you so much for having me!
Bio - Gwen Shapira, Co-founder and CPO at Stealth Startup
Gwen is currently building a new product at a company that she co-founded but isn’t ready to talk about. Before taking this exciting step, she was an engineering director and led the Cloud-Native Kafka org at Confluent. Gwen spent many years designing scalable data architectures and leading open source communities. She’s an Apache Kafka PMC member, a co-author of “Kafka: The Definitive Guide”, and a frequent presenter at industry conferences.
For more engineering updates and insights:
Join our Telegram channel
Visit us on GitHub
Subscribe to our YouTube channel