“You keep using that word. I do not think it means what you think it means.”
– Inigo Montoya (Mandy Patinkin), The Princess Bride (1987)
In our industry, one can quickly become overwhelmed by all the buzzwords that get thrown around. I’ve even gone as far as handing out “buzzword bingo” cards (with small prizes for the winner) in presentations where I felt my own content was over the top – a small consolation and an attempt at self-deprecating humor. That said, the overuse of buzzwords without sufficient context or a mutually agreed definition also leads to widespread misinterpretation. That is the subject of my post today, with a focus specifically on what the term “Lake House” should mean.
I have the pleasure of living in the great state of Minnesota, which is also known as the “Land of 10,000 Lakes.” As a result, the term Lake House evokes a different image in my mind than it may for many others who work with data on a day-to-day basis. Here in Minnesota, we talk about going to the cabin, while our friends in other parts of the Upper Midwest or Canada might refer to a cottage, and I’m told people in Maine call it a camp (regardless of the structure type). No matter what you call it, these are all some form of Lake House, and their main premise is providing access to water so that you can enjoy activities like swimming, water skiing or fishing, depending on your personal preferences. With that overarching theme of access to water, there are also a few important attributes of this type of Lake House to consider:
- Proximity – Being close to the water is invaluable, whether you have your own slice of lakeshore to swim and boat from or you are just near enough to quickly make your way in and out. No one wants to spend a significant part of their day in transit to and from the fun. Having access to a dock makes life easy and enjoyable – who doesn’t love sprinting out the back door and straight down the dock for that refreshing leap into the lake?
- Security – There are two important aspects to consider here: shelter and protection of property. The Lake House must provide a comfortable respite where people are safe from weather or other threats – in other words, they need to trust the environment. In addition, it needs to be secured so that access is only granted to the guests you deem appropriate.
- Features – This list is as varied as the people who own lake houses, but I think everyone has a few must-haves to make their time at the lake enjoyable beyond just the water and the good company. For me, those are a grill, a fire pit, lawn games, cold beer, and something to listen to or watch sports on. I’m sure everyone has their own list to fit their style.
Naturally, when I heard some buzz over the past couple of years about Lake House architectures, I assumed, perhaps incorrectly, that this was meant as an analogy to the above. Truth be told, I didn’t even recognize that it was a portmanteau of “Data Lake” and “Data Warehouse” until later, and that realization left me disappointed. Eventually, I found myself saying “You keep using those words. I do not think they mean what you think they mean.” Put simply, I felt that a Lake House needed to offer more than merging data lakes and data warehouses, and certainly its focus shouldn’t be on more data movement.
To that end, let’s explore for a moment what it might mean to have a data Lake House that mimics the qualities of a cabin in Minnesota. Obviously, that starts with the general premise that the purpose is to provide access to data. Breaking down that theme again using the same attributes might yield the following considerations:
- Proximity – Being close to the data doesn’t necessarily mean the technology should hold the data itself (collect); it is about optimized access (connect). To get the most out of your data, the Lake House needs to provide immediate access to business data as it happens – avoiding latency in both the preparation of and the connectivity to your data.
- Security – This one just feels obvious – we can no longer give users wide-open access. The Lake House must ensure proper controls without hampering business insights. Doing this well goes beyond authentication and authorization by providing features for masking and anonymization, so that users don’t have to choose between privacy and innovation.
- Features – Again, this can quickly get out of hand, but I think there are some base features that are expected in addition to the optimized connectivity and privacy features already mentioned. The most obvious is the ability to store data efficiently when virtual connectivity isn’t viable – and this includes proper tiering based on data access patterns. Another key need is the ability to natively offer multi-model storage and processing capabilities, such as a JSON document store, graph analysis, spatial data types and calculations, and machine learning algorithms. For a Lake House, providing a single consumption layer for a wide variety of services is the key to giving users the flexibility to deliver tremendous value.
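To make the masking point in the Security attribute above a little more concrete, here is a minimal Python sketch of what "privacy without blocking insight" could look like at the record level. Everything in it is an assumption for illustration only: the function names (`mask_email`, `pseudonymize`, `protect_record`), the field names, and the salt value are hypothetical and do not describe how any particular product implements masking. Note also that a salted hash is pseudonymization rather than true anonymization, since the same input always maps to the same token.

```python
import hashlib


def mask_email(email: str) -> str:
    """Keep the first character of the local part and the domain; hide the rest."""
    local, sep, domain = email.partition("@")
    if not sep or not local:
        # Not a recognizable address, so hide the whole value.
        return "***"
    return f"{local[0]}***@{domain}"


def pseudonymize(value: str, salt: str) -> str:
    """Replace an identifier with a stable token: first 12 hex chars of a salted SHA-256."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]


def protect_record(record: dict, salt: str) -> dict:
    """Return a copy of the record with sensitive fields masked or pseudonymized.

    Non-sensitive fields (here: region, revenue) pass through untouched,
    so analysts can still aggregate and join on the protected data.
    """
    protected = dict(record)  # shallow copy; the original record is not modified
    protected["email"] = mask_email(record["email"])
    protected["customer_id"] = pseudonymize(record["customer_id"], salt)
    return protected


if __name__ == "__main__":
    row = {
        "customer_id": "C-1001",
        "email": "jane.doe@example.com",
        "region": "MN",
        "revenue": 1250.0,
    }
    print(protect_record(row, salt="demo-salt"))
```

Because the token is stable for a given salt, two protected records for the same customer can still be grouped together, which is the kind of trade-off between control and usability described above.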
Ultimately, I believe we should expect more from a Lake House in the context of data. There’s so much potential value when we think beyond the dichotomy of either hoarding or isolating data, and instead, focus on opening access in a super-efficient manner while also balancing enterprise control with advanced processing needs.
This isn’t the right post for me to write about the value of SAP HANA Cloud and SAP Data Warehouse Cloud in fulfilling the bigger expectations I’ve described for a Lake House, but I will say that I think they’re uniquely positioned to do just that. I’ll write more on that soon and will link it here, and I welcome anyone to reach out to me directly on this topic.
In closing, I hope you can now see why Inigo Montoya was right: “Lake House” may not mean what many think it does. More importantly, I hope you have a renewed perspective on why I believe it should mean much more.