In July 2012 I participated in a BMC Big Data Summit held at BMC headquarters in Houston, Texas. I met architects from many of the BMC product lines, and we discussed Hadoop and Big Data opportunities along with customer use cases, challenges, and needs (I’ve described some of the Workload Automation use cases before in an article published in Enterprise Systems Media magazine). This is the “behind the scenes” story of the team that made Control-M for Hadoop a reality.
Amit Cohen: Product Management
As soon as we had the green light for the project, R&D started the technical research, and together we identified the potential content for the first release. With the Big Data market and the Hadoop ecosystem being so dynamic, we wanted to ensure that what we developed actually addressed customer needs. We validated both the potential content and the use cases with customers in North America, EMEA, and AP, and adjusted our plans according to their input. Support for HDFS file watching, for example, was added based on such feedback.
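To make “file watching” concrete, here is a minimal polling sketch in Python. It watches a local path as a stand-in for an HDFS path, and the function name and parameters are my own invention; a product-grade watcher would query HDFS itself rather than the local filesystem:

```python
import pathlib
import time

def wait_for_file(path, timeout_s=3600.0, poll_s=30.0):
    """Poll until `path` exists or the timeout expires.

    Illustrative sketch only: a real HDFS file watcher would check the
    Hadoop filesystem (e.g. via WebHDFS), not the local disk.
    """
    deadline = time.monotonic() + timeout_s
    target = pathlib.Path(path)
    while True:
        if target.exists():
            return True          # file arrived: downstream job may start
        if time.monotonic() >= deadline:
            return False         # gave up waiting
        time.sleep(poll_s)
```

The point of building this into the scheduler is that nobody has to write, tune, and babysit loops like this one by hand.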
We learned what kinds of challenges companies experienced when using Hadoop-specific schedulers (such as Oozie) and realized that we could deliver immediate value by offering the ability to manage Hadoop batch jobs with the same power and ease as “traditional” enterprise processing. Control-M for Hadoop allows application developers to focus on developing Hadoop programs rather than wasting time writing and debugging the wrapper scripts that schedule those programs.
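For readers who haven’t seen one, here is a hedged sketch of the kind of homegrown scheduling wrapper the text describes (names, paths, and flag-file convention are invented for illustration): run the job, then drop an OK or FAIL marker file so a downstream script or operator can react. This is exactly the plumbing a native scheduler integration removes:

```python
import pathlib
import subprocess

def run_and_flag(cmd, ok_flag, fail_flag):
    """Run a batch command and touch an OK or FAIL marker file.

    Hypothetical example of a homegrown scheduling wrapper; a real one
    would invoke something like
    ["hadoop", "jar", "wordcount.jar", "WordCount", "/in", "/out"].
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    marker = ok_flag if result.returncode == 0 else fail_flag
    pathlib.Path(marker).touch()   # downstream jobs poll for this flag
    return result.returncode
```

Multiply this by every job in a flow (plus retry logic, logging, and alerting) and the maintenance burden becomes clear.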
The main challenge customers shared with us was integrating Hadoop jobs with data integration activities, analytics tasks, and file transfers – the types of integration for which Control-M was specifically designed. They also asked for proactive notification on missed SLAs and a self-service offering for application developers. Control-M’s ability to easily integrate mainframe and distributed tasks was another key factor for customers who shifted data processing from DB2 to Hadoop in order to reduce mainframe processing costs.
Avner Waldman & Shaul Segal: Research & Development
We have always been passionate about learning new technologies, but the team’s excitement rose to a new level after our discussions with customers and our understanding of the value we would be providing them. We learned that some customers had been running Hadoop jobs with Control-M for years using homegrown wrapper scripts, but were looking for a tighter, more “native” integration that would reduce effort and risk. We helped them eliminate these scripts by replacing them with jobs defined from a simple and powerful graphical user interface.
We started the research on a couple of single-node Hadoop clusters, but this wasn’t enough for us. We knew that our customers’ Hadoop environments were more complex, and that Control-M for Hadoop had to support those environments. We tested various Hadoop distributions in multi-node cluster configurations. Our virtualized infrastructure allowed us to provision Linux instances quickly, and feedback from our customers helped us configure the Hadoop clusters to be as similar as possible to their environments.
Gad Ron: Architect
Being involved with the first BMC Big Data initiative was what excited me the most. Other product lines are now following us, developing additional offerings around Big Data, but we got to be the leading team. We’ve been “playing” with Hadoop for a couple of years now and finally reached a point where the market demand for enterprise support justified the development costs. Now that the first release of Control-M for Hadoop is available and customers are adopting it, we have a larger community to get feedback from, and we are already looking into additional use cases for the next release.
I love the idea of a new and innovative technology that is mostly batch oriented. Over the years we have heard people saying that IT is turning to a completely online-driven approach, but the truth is the exact opposite. It’s like saying that the mainframe is dying…
Analyst predictions of Big Data market growth encourage us to invest in additional research into Big Data technologies. NoSQL databases and in-memory databases (such as SAP HANA) are now on the table as well, alongside the social, mobile, and cloud initiatives. We are also witnessing a trend of siloed, exploratory Big Data projects turning into enterprise-wide Big Data initiatives, and our customers are looking for tools to support that shift.
Oranit Atias: Project Management
Communication is always a key factor in these types of projects. We all learned the new technology together, shared customer input, and worked collectively to ensure we met our project deadlines and quality standards. In fact, we completed the project ahead of time. The alignment of all participating teams, including documentation and support, with the dynamic nature of the project was all I could ask for. When I saw the Times Square ad and Control-M for Hadoop on the www.bmc.com front page, I couldn’t have been more proud, and I felt privileged to be part of the team.
The feedback from the customers who evaluated pre-GA releases of Control-M for Hadoop helped us design and execute the testing use cases. We made sure our test coverage included the same platforms and configurations those customers use, and in return they now have a much more stable and robust solution that meets their needs, IT standards, and configurations. The learning curve for the new technology was relatively short because we had been there with R&D and product management since the beginning of the project. We participated in the technical research, the discussions with customers, and the release specification planning. This was truly a team effort. We ended up with a shorter release cycle and, ultimately, a better product.
Robin Reddick: Solution Marketing (@robinreddick)
Working on Control-M for Hadoop let me do two of my favorite things as a solutions marketer – bring a new product to market AND in a new market area. I began researching the big data market opportunities well over a year ago. With all of the initial big data processing being batch (scheduled), it was a perfect fit for Control-M and just a matter of the right time.
Reaching out to customers, understanding their needs, and then working with product management, sales, and other stakeholders in the company – it was all great fun, and of course challenging as well.
The best part of the entire effort was working closely with customers to understand their business needs. Every customer I spoke with was using Hadoop and big data to learn more about their business – to make better-informed decisions. Each of them was passionate and excited about the opportunities they were finding to offer better, even personalized, service and new products, and to improve their business operations. Their excitement was infectious.
Memorable moments? The first meeting with a customer, which was MetaScale, and understanding just what a game-changer big data really is for businesses. The first conversation with a sales rep – they called Hadoop “Hoopla” the entire conversation, making me realize how important training would be. And throughout – working with my teammate Joe Goldberg, who was relentless and remarkable the entire time.
It’s not over yet. This is just the beginning!