Monday, January 12, 2015

Enterprise Architecture Principles worth considering...


There are many architecture principles defined by various Enterprise Architecture frameworks. I’m only going to touch on some of the most important ones, the principles that can help an enterprise benefit and achieve maximum value from IT. Architecture principles are high-level definitions of fundamental values that guide the IT decision-making process, serving as a base for the IT architecture, development policies, and standards.
Architecture principles define the general rules and guidelines for using and implementing all information technology (IT) resources and assets throughout a company. They must reflect a level of consensus among several corporate components and areas, constituting the basis for future IT decisions. Each architecture principle must focus mainly on business goals and key architecture applications. And remember what a principle means: “Service Above All”. I might go into some principles in detail while others get only a line, because they are all valuable to the enterprise.

Here’s one of the most important principles -
IT and business alignment
To succeed with IT, management decisions should always be made from a business-alignment perspective in order to generate maximum benefit for the company as a whole. Better alignment between IT and the business creates a competitive edge for the enterprise. Decisions based on the corporate perspective have greater long-term value than decisions based on the perspective of a group with a specific interest. An optimal ROI requires information management decisions to be aligned with the company's priorities and positioning. No single area should diminish the benefit of the company as a whole; this principle, however, must not prevent anyone from performing their tasks and activities. Aligning IT with the business and promoting optimal corporate benefit requires changes in how information is planned and managed, and technology alone is not enough to bring about such changes. IT architecture must implement a complete IT vision that is focused on the business. Application development priorities must be established by and for the entire company, and application components must be shared among all areas of the enterprise.
Maximum benefits at the lowest costs and risks
Strategic decisions about solutions must always strive to maximize the benefit generated for the business at the lowest long-term risk and cost. Decisions must not be made solely on the basis of lower solution cost. Every strategic decision must be assessed from cost, risk, and benefit perspectives; lower costs often carry greater risks and, perhaps, fewer benefits.
Business continuity
I don’t think I have to say much on this one, we all know it: corporate activities must be maintained despite system interruptions.
Compliance with standards and policies
Corporate information management processes must comply with all applicable internal policies and regulations.
Adoption of market best practices
IT activities must always be aligned with market best practices for IT governance, processing, and management.
Information treated as an asset
Information is a valuable asset to the company and is managed based on this concept.

Information security
In my view this is one of the most important principles. Information is protected based on integrity, availability, confidentiality, non-repudiation (incontestability), and authenticity, and every piece of information is subject to a security assessment based on those five factors.
Security must be designed into data elements from the beginning, rather than added later. Systems, data, and technologies must be protected against unauthorized access and handling, and the source of information must be protected against unauthorized or accidental modification, fraud, catastrophe, or disclosure.
Data security can restrict access down to read-only or no-access levels. Sensitivity labels must be established for access to temporary, decisive, confidential, sensitive, or proprietary information.

Convergence with the enterprise architecture
The convergence with the enterprise architecture takes place as new applications are built, new technologies are implemented, and older systems are updated or decommissioned. Exceptions to the enterprise architecture might be supported for specific cases if there is a consensus that the benefits of using a solution from a specific technology exceed those arising from the adoption of the enterprise architecture.

Low-coupling interfaces

Implement microservices: low-coupling interfaces are preferable, because when interfaces between independent applications are tightly coupled, they are less generic and more susceptible to causing unwanted side effects when they are changed.
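As a minimal sketch of what low coupling looks like in code (the `PaymentGateway` and `StripeGateway` names here are hypothetical, purely for illustration): the caller depends on a narrow interface, so implementations can be swapped without rippling through the system.

```python
from abc import ABC, abstractmethod

class PaymentGateway(ABC):
    """Narrow, low-coupling interface: callers depend only on this contract."""

    @abstractmethod
    def charge(self, amount_cents: int) -> bool:
        ...

class StripeGateway(PaymentGateway):
    """Hypothetical concrete implementation; a real one would call out over HTTP."""

    def charge(self, amount_cents: int) -> bool:
        return amount_cents > 0  # stand-in for a real payment call

class CheckoutService:
    """Depends on the interface, not a concrete gateway, so either side
    can change independently - the low-coupling property."""

    def __init__(self, gateway: PaymentGateway):
        self.gateway = gateway

    def checkout(self, amount_cents: int) -> str:
        return "paid" if self.gateway.charge(amount_cents) else "declined"
```

Replacing `StripeGateway` with any other `PaymentGateway` implementation requires no change to `CheckoutService`, which is the whole point of keeping the interface narrow.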
You could go on and on about this topic, and I think that if all of us as Enterprise Architects stay on top of these enterprise architecture principles, we can deliver outstanding results in IT.

Saturday, March 24, 2012

Cassandra vs MongoDB


Big data has become a common discussion item within enterprises and the developer community, and everyone is trying to solve this puzzle. It has become almost a challenge for developers and architects to choose the right product for their applications. There are a few outstanding solutions available in the industry: Cassandra, Hadoop, MySQL, MongoDB and Riak, to name a few. All of them have some great features and are being used by enterprises like Twitter, Facebook and Netflix. I’m not trying to promote one over the other, but I thought I would share what I have learned so far, having played with some of these solutions and read about others. Here I’m trying to list some features and benefits of Cassandra, MongoDB and Riak. Each of them has advantages and disadvantages. Here you go..

Cassandra

Cassandra is an open source distributed database management system written in Java. It’s designed to be a highly scalable second-generation distributed database. In Cassandra, documents are known as “columns”, which are really just a single key and value; it also has Bigtable-like features, columns and column families. You can query by column or by a range of keys. There’s also a timestamp field, which is used for internal replication and consistency. The value can be a single value but can also contain another “column”. These columns then exist within column families, which order data based on a specific value in the columns, referenced by a key.
In Cassandra, nodes represent ranges of data. By default, when a new machine is added, it will receive half of the largest range of data. You can change this behavior by choosing different configuration options during node start-up. There are certain configuration requirements to ensure safe and easy balancing, and there is a rebalance command that can perform the work throughout all the data ranges. It comes with a monitoring tool that lets you track the progress of the re-balancing. Cassandra is much lighter on memory requirements, especially if you don’t need to keep a lot of data in cache. Cassandra requires a lot more metadata for indexes and requires secondary indexes if you want to do range queries.
One more advantage of Cassandra is its much more advanced support for replication. The server can be set to use a specific consistency level to ensure that queries are replicated locally or to remote data locations. This means you can let Cassandra handle redundancy across nodes, and it is aware of which rack and data center those nodes are on. Cassandra can also monitor nodes and route queries away from slow-responding nodes. You can choose between synchronous and asynchronous replication for each update, and it offers highly available asynchronous operations.
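As a toy illustration of the consistency idea (not Cassandra's actual implementation or driver API): a quorum read contacts a majority of replicas and resolves divergent copies by timestamp, the same last-write-wins rule Cassandra applies.

```python
import random

def quorum_read(replicas, key, n=3):
    """Read from a quorum (majority) of n replicas and return the value
    with the newest timestamp. Each replica maps key -> (value, ts);
    last-write-wins resolves divergent copies."""
    quorum = n // 2 + 1
    sampled = random.sample(replicas, quorum)  # any majority will do
    versions = [r[key] for r in sampled if key in r]
    return max(versions, key=lambda v: v[1])[0]  # newest timestamp wins

# Three replicas: one still holds a stale copy of the row.
replicas = [
    {"user:1": ("alice@old.example", 100)},
    {"user:1": ("alice@new.example", 200)},  # newest write
    {"user:1": ("alice@new.example", 200)},
]
```

Because the write reached a majority (two of three replicas), any majority read overlaps it and always returns the newest value, regardless of which replicas are sampled.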
Having discussed the advantages of Cassandra, it is only fair to point out a disadvantage: Cassandra replication settings are configured at the node level with configuration files, whereas MongoDB allows very granular, ad-hoc control down to the query level through driver options, which can be called in code at run time.
Best used: when you have more writes than reads, something like logging events. Financial institutions are a great example, as they care about each activity and log a lot of data.
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html


MongoDB

Unlike Cassandra, which is written in Java, MongoDB is written in C++ and provided in binary form for Linux, OS X, Windows and several other platforms. It’s extremely easy to “install”: download, extract and run.
Mongo provides master/slave replication and auto-failover with replica sets. It has sharding built in, and it uses memory-mapped files for data storage.

In MongoDB replication is achieved through replica sets. This is an enhanced master/slave model where you have a set of nodes where one is the master. Data is replicated to all nodes so that if the master fails, another member will take over. There are configuration options to determine which nodes have priority and you can set options like sync delay to have nodes lag behind.
This might be very important information for you if you plan to use the data for auditing, or in other words can’t afford to lose data: you might want to think again before choosing Mongo, because writes in MongoDB are “unsafe” by default. Data isn’t written to disk right away, so it’s possible that a write operation could return success but be lost if the server fails before the data is flushed. This is how Mongo attains high performance. If you need increased durability, you can specify a safe write, which guarantees the data is written to disk before returning. Further, you can require that the data also be successfully written to n replication slaves.
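Here is a toy simulation of the trade-off described above (not MongoDB's actual code or driver API, just the concept): unacknowledged writes sit in memory and are lost on a crash, while a safe write forces the flush before returning.

```python
class ToyStore:
    """Toy model of MongoDB's default write behavior circa 2012:
    writes land in memory and are flushed to disk later, so an
    unacknowledged write can be lost if the server dies first."""

    def __init__(self):
        self.memory = []   # memory-mapped view: fast, but volatile
        self.disk = []     # what actually survives a crash

    def write(self, doc, safe=False):
        self.memory.append(doc)
        if safe:           # "safe write": wait for the flush before returning
            self.flush()

    def flush(self):
        self.disk = list(self.memory)

    def crash(self):
        self.memory = list(self.disk)  # unflushed writes are gone

store = ToyStore()
store.write({"event": "audit-1"}, safe=True)   # durable before returning
store.write({"event": "audit-2"})              # fast, but at risk
store.crash()
```

After the simulated crash, only the safely written document survives, which is exactly the failure mode that makes the default risky for audit-style data.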
MongoDB drivers also support the ability to read from slaves. This can be done on a connection, database, collection or even query level and the drivers handle sending the right queries to the right slaves, but there is no guarantee of consistency (unless you are using the option to write to all slaves before returning). In contrast Cassandra queries go to every node and the most up to date column is returned (based on the timestamp value).
MongoDB is a mix-and-match solution from both worlds, NoSQL and RDBMS, and in this respect it works very much like a relational database. You create single or compound indexes at the collection level, and every document inserted into that collection has those fields indexed. Querying by index is extremely fast as long as you have all your indexes in memory.
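A rough sketch of why indexed lookups are fast (a plain hash index rather than MongoDB's actual B-tree, with made-up sample documents): build the index once, then answer queries without scanning the collection.

```python
def build_index(collection, field):
    """Toy single-field index: field value -> list of matching documents.
    MongoDB keeps a comparable per-field structure, which is why indexed
    queries stay fast as long as the index fits in memory."""
    index = {}
    for doc in collection:
        index.setdefault(doc.get(field), []).append(doc)
    return index

users = [
    {"name": "ann", "city": "austin"},
    {"name": "bob", "city": "boston"},
    {"name": "cal", "city": "austin"},
]

by_city = build_index(users, "city")
# by_city["austin"] now answers the query without touching every document
```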
Best used: if you need dynamic queries, if you prefer to define indexes rather than map/reduce functions, and if you need good performance on a big DB.

I'm going to leave this here and will try to cover Riak with my next post. 

Saturday, January 21, 2012

What's Gradle?

I think the title for this post should be “Best of Both Worlds - Maven & Ivy”. In today’s world these are the most commonly used dependency management and build management tools. Here I’m assuming that you have prior knowledge of Maven or Ivy, and if you’ve had a chance to play with them, most of you would agree with my next statement: Ivy is a far superior dependency management tool than Maven, and Maven is a far better build automation tool than Ivy. Developers and architects needed a tool that combines the functionality of these two outstanding tools.
Gradle, in my opinion, has evolved from Maven and Ivy. As we all know, both Maven and Ivy are excellent tools, but both have some limitations, and that’s where Gradle is going to win!
It can work with a Maven or Ivy repository, or with plain jars in a directory on the local file system. One interesting part: unlike Maven or Ivy, Gradle scripts are Groovy-based, not XML.
Here’s an excellent user guide for Gradle:

http://gradle.org/docs/current/userguide/userguide_single.html#multiProject
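To show the flavor, here is a minimal build.gradle sketch from that era (the dependency coordinates below are placeholders, not a recommendation): Groovy instead of XML, with dependencies resolved Maven-style and plain jars picked up from the local file system.

```groovy
// Minimal Gradle 1.x-era build script: Groovy, not XML.
apply plugin: 'java'

repositories {
    mavenCentral()              // resolve from a Maven repository
    flatDir { dirs 'libs' }     // or plain jars on the local file system
}

dependencies {
    // placeholder coordinates, purely illustrative
    compile 'org.springframework:spring-core:3.1.0.RELEASE'
    testCompile 'junit:junit:4.10'
}
```

Running `gradle build` with this file compiles, tests and jars the project using Maven-style conventions, while the dependency handling keeps Ivy-like flexibility.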

Friday, December 16, 2011

Open Source Adoption Will Continue To Grow

We are into the last month of our journey through 2011. This year we have already seen lots of enterprises moving toward open source solutions, and vendors are having a hard time competing in this emerging open source adoption market. As we enter the new year, 2012, it won't be hard to predict that this pace will continue to accelerate. More and more enterprises are leaning towards open source solutions. I don't think this pace of adoption will slow down in the near future either, and why would it? With open source tools, technologies and PaaS providers, an enterprise can build systems and sites and achieve uptime close to five nines (99.999%).

Although enterprises are slowly but steadily adopting these services from various cloud providers, by definition the dynamic of this model removes a lot of the friction that companies experience when adopting open source technology stacks. Think about it: would you still be concerned about using Ruby or MySQL if VMware or Red Hat provisioned, hosted, managed and scaled the technology for you in a very elastic, self-healing infrastructure?

From languages to cloud services, open source adoption has almost forced companies to look beyond their umbrella of technologies and tools, and why not? Many companies (Facebook, Twitter etc.) have proven that they can run their business on open source solutions alone without any issues, as they can scale on demand. I can think of one example in the Enterprise Service Bus space: Mule stands out in comparison to other vendor-provided ESBs.

PaaS providers - VMware Cloud Foundry and Red Hat OpenShift - provide the foundation to host, manage, provision and scale solutions based on some of the most renowned open source technologies, such as Ruby on Rails, Hadoop, MongoDB, MySQL and Cassandra.

Open source PaaS platforms represent one of the fastest-growing cloud computing models. With vendors like VMware, Salesforce.com and Red Hat leading the way, open source cloud platforms enable the benefits of cloud computing on some of the most popular open source technologies in the market. By providing a model to host, deploy, scale and manage open source solutions, this model overcomes some of the major challenges that have prevented big enterprises from widely adopting open source technologies.

Tuesday, November 8, 2011

Spring Introduces Environment Profiles


Spring 3.1 introduces the notion of an Environment, and this abstraction of profiles has been integrated throughout the container. As always, the Spring community is trying to help developers by introducing this very useful concept; I think it will be very useful for applications and will definitely reduce project timelines. An application can now easily switch between profiles. Think of environment-specific bean definitions: the need to register certain bean definitions in certain contexts but not in others. You could say that you want to register one profile of bean definitions in situation A and a different profile in situation B. It's important to understand that the Environment holds the information about which profiles (if any) are currently active. When the Spring ApplicationContext loads bean definition files, it pays close attention to the profile attribute in each of them; if the attribute is present and set to the name of a profile that is not currently active, the entire file is skipped and no bean definitions are parsed or registered. You load the ApplicationContext, the Spring container, the same way as before, and it activates your profile.

Play safe - I would still play it safe and use this only when it's really necessary, for example when you are only changing some bean values. You can use this when the sets of beans are similar between dev and production.
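Here is a sketch of the profile attribute described above, in Spring 3.1-style XML (the file, bean and property names are made up for illustration):

```xml
<!-- dev-beans.xml: parsed only when the "dev" profile is active -->
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans-3.1.xsd"
       profile="dev">

    <!-- in-memory datasource for development only -->
    <bean id="dataSource"
          class="org.springframework.jdbc.datasource.DriverManagerDataSource">
        <property name="url" value="jdbc:h2:mem:devdb"/>
    </bean>
</beans>
```

You would then activate the profile before refreshing the context, e.g. via `ctx.getEnvironment().setActiveProfiles("dev")` or the `spring.profiles.active` system property; with any other profile active, this whole file is skipped.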

Sunday, October 16, 2011

Analytics

Data - in my opinion it's the single most important asset of any organization. And the most familiar analogy out there is data warehousing. I'm not going to take a deep dive into data warehousing here, as there are many sites where you can read about DW and business intelligence and how many organizations are using BI. Just in case you get lazy and don't want to read the details, I thought I'd give a brief description of business intelligence/analytics and why it's VERY important for enterprises to enable BI in their IT operations.

Why do we need it?

Today, with industry trends changing day and night, it has become essential for every organization to understand and analyze its data. Most IT organizations collect data or use data to predict the future or growth of the enterprise; it could be retail demand forecasting, supply chain inventory management or customer-related data.

Great organizations - today, great organizations distinguish themselves from others because they are already using business intelligence in their processes and operations, while other organizations are getting ready, or are in the process of enabling BI in their IT operations, so that they are prepared to improve their business.

So what's Analytics?

Analytics is all about gaining insights from data for better decision making. BI is taking a big lead as IT organizations across the world use data-driven insights for strategic, financial and operational excellence, and these organizations use predictive analytics to understand future growth plans for their business. Many technology companies provide tools for BI and predictive analytics.

Here's one solution, while there are many, many solutions out there...

Recently I came across something interesting and thought of sharing it: Oracle announced the Exalytics Business Intelligence Machine, the world’s first engineered system specifically designed to deliver high-performance analysis, modeling and planning. Built using industry-standard hardware, market-leading business intelligence software and in-memory database technology, Oracle Exalytics is an optimized system that delivers answers to business questions with unmatched speed, intelligence, simplicity and manageability. For more information, follow Oracle Exalytics.

Friday, September 16, 2011

NoSQL to NewSQL


Wow!! It's amazing how technology keeps changing; it was only a few months back that I was reading all the good things about NoSQL solutions, and recently I came across another big buzzword called “NewSQL”.
It's like going from SQL to NoSQL and now to NewSQL. Let me try to point out the current major trends in the industry:

- NoSQL databases, designed to meet the scalability requirements of distributed architectures and/or schemaless data management requirements.

- NewSQL databases, designed to meet the requirements of distributed architectures, or to improve performance such that horizontal scalability is no longer needed.

- Data grid/cache products, designed to store data in memory to increase application and database performance.

NoSQL solutions such as MongoDB and Cassandra are still emerging and gaining popularity as an answer to the limitations of traditional database systems.
Like any other technology, there are some shortcomings: many of the NoSQL solutions don't provide ACID (atomicity, consistency, isolation, durability)-level operations, a widely used set of guarantees that assure a database-driven online transaction is carried out accurately, even if the system is interrupted. ACID compliance can be written in at the application layer, though writing the code for such operations is far from trivial.
Lastly, each NoSQL database comes with its own query language; while most of them follow a key-value pair model, this still makes it difficult to standardize application interfaces.
In contrast, NewSQL can provide the quality of assurance associated with SQL systems, while offering the scalability of NoSQL systems.
The NewSQL approach involves a number of novel architecture designs. It eliminates the resource-hogging buffer pool by running the database entirely in main memory. It removes the need for latching by running only as a single thread of the server (though some overhead would still be needed for other locking operations). And expensive recovery operations can be eliminated in favor of using additional servers for replication and failover.
Also, one of the most important aspects: enterprises don't have to re-write their applications to code against a NewSQL solution.
It's amazing that in such a short time there are so many NewSQL providers available in the industry to choose from.
Here are some NewSQL solutions - Clustrix, EnterpriseDB, GenieDB, ScalArc, Schooner, VoltDB, RethinkDB, ScaleDB, Akiban, CodeFutures, ScaleBase, Translattice, and NimbusDB, as well as Drizzle, MySQL Cluster with NDB, and MySQL with HandlerSocket. The associated “NewSQL-as-a-service” category includes Amazon Relational Database Service.


Wednesday, August 24, 2011

RDBMS to NoSQL

RDBMS: The Awesome and the Not So Much

Relational database (RDBMS) technology, a “scale-up” technology that has not fundamentally changed in over 40 years, continues to be the default choice for holding data behind Web applications.

Database technology has not kept pace. Relational database technology, invented in the 1970s and still in widespread use today, was optimized for the applications, users and infrastructure of that era. In some regards, it is the last domino to fall in the inevitable march toward a fully-distributed software architecture. While a number of band-aids have extended the useful life of the technology (horizontal and vertical sharding, distributed caching and data denormalization), these tactics nullify key benefits of the relational model while increasing total system cost and complexity.
Long story short, there are many ways you can try to scale an RDBMS, like:
Sharding
Denormalization
Distributed caching
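As a toy sketch of the sharding idea (hand-rolled, not how any particular database implements it): a stable hash of the key picks the shard, which is exactly the "split it" step that is hard to retrofit onto an RDBMS.

```python
import hashlib

def shard_for(key, num_shards):
    """Stable hash of the key picks the shard. md5 keeps the mapping
    consistent across processes, unlike Python's salted built-in hash().
    Real setups also need re-balancing when num_shards changes - one
    reason sharding an RDBMS by hand is painful."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Spread six user ids over four shards.
shards = [[] for _ in range(4)]
for user_id in ("u1", "u2", "u3", "u4", "u5", "u6"):
    shards[shard_for(user_id, 4)].append(user_id)
```

Every key lands deterministically on one shard, so reads go straight to the right server; the pain comes later, when adding a shard changes the modulus and most keys have to move.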
ORM: Again, better but not so much

While I’m talking about RDBMS, how can we ignore ORM frameworks. If you’re an architect/application developer, you’ll no doubt be familiar with the many object-relational mapping (ORM) frameworks that have sprung up in recent years to help ease the difficulty in mapping application objects to a relational model. Again, for small systems, ORM can be a relief. But it also introduces new problems of its own, such as extended memory requirements, and it often pollutes the application code with increasingly unwieldy mapping code.

Good luck if you're trying to achieve performance and scalability playing with the above options. If I were you and I had the power to change the architecture, I would choose the following option. Maybe not all applications are a fit for this, but you can design most applications with a “NoSQL” solution.

I would like to mention one quote I read somewhere: “If you can’t split it, you can’t scale it”.

Here comes the “NoSQL” solution:

The vendors of these RDBMSs have very little incentive to disrupt a technology generating billions of dollars for them, so companies such as Facebook, Google and Amazon were, out of necessity, forced to invent new approaches to data management. These “NoSQL” or non-relational database technologies are a better match for the needs of modern interactive software systems. But not every company can or should develop, maintain and support its own database technology. Building upon the pioneering research at these and other leading-edge organizations, commercial suppliers of NoSQL database technology have emerged to offer database technology purpose-built to enable the cost-effective management of data behind modern Web and mobile applications.

While implementations differ, NoSQL database management systems share a common set of characteristics:
No schema required
Data can be inserted in a NoSQL database without first defining a rigid database schema. As a corollary, the format of the data being inserted can be changed at any time, without application disruption. This provides immense application flexibility, which ultimately delivers substantial business flexibility.
Auto-sharding (sometimes called “elasticity”). A NoSQL database automatically spreads data across servers, without requiring applications to participate. Servers can be added or removed from the data layer without application downtime, with data (and I/O) automatically spread across the servers. Most NoSQL databases also support data replication, storing multiple copies of data across the cluster, and even across data centers, to ensure high availability and support disaster recovery. A properly managed NoSQL database system should never need to be taken offline, for any reason, supporting 24x7x365 continuous operation of applications.

I think for this post, I’m going to leave it here and would come back and provide some of my favorite “NoSQL” solutions.

Till then good night folks…

Thursday, July 14, 2011

Deploying Application with Cloud Foundry

In my previous post I wrote about VMware Cloud Foundry, and a few days back I was playing with it. Again, it looks very impressive, and it seems Derek's (Cloud Foundry CTO) team has done a fabulous job with this first PaaS offering from VMware. I have used and deployed my apps on Google App Engine as well, and I have to say Cloud Foundry stands right there with Google App Engine and Amazon. In my view it will attract a lot of industry attention as it matures, and more and more folks will start using it.

To give you an idea, it only took me 30-40 minutes from start to finish to deploy this application. I used the VMC CLI tool to deploy it. Spring STS also has a plugin for Cloud Foundry; I might give it a try sometime later. Here’s my first “Hello World” kind of application, deployed on Cloud Foundry.

http://hellovikascloudfoundry.cloudfoundry.com/

Here's my terminal output from my mac:


Vikas-Kumars-MacBook-Pro:~ Vikas$ bash

bash-3.2$ vmc target api.cloudfoundry.com

Vikas-Kumars-MacBook-Pro:~ Vikas$ bash

bash-3.2$ ruby -v

ruby 1.8.7 (2009-06-12 patchlevel 174) [universal-darwin10.0]

bash-3.2$ gem -v

1.3.5

bash-3.2$ sudo gem update --system

Updating RubyGems

Updating rubygems-update

Successfully installed rubygems-update-1.8.5

Updating RubyGems to 1.8.5

Installing RubyGems 1.8.5

RubyGems 1.8.5 installed

=== 1.8.5 / 2011-05-31

* 2 minor enhancement:

* The -u option to 'update local source cache' is official deprecated

* Remove has_rdoc deprecations from Specification.

* 2 bug fixes:

* Handle bad specs more gracefully.

* Reset any Gem paths changed in the installer.

RubyGems installed the following executables:

/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/gem

bash-3.2$ clear

bash-3.2$ sudo gem install vmc

Fetching: json_pure-1.5.3.gem (100%)

.......

bash-3.2$ vmc

bash-3.2$ vmc target api.cloudfoundry.com

Succesfully targeted to [http://api.cloudfoundry.com]

Successfully logged into [http://api.cloudfoundry.com]

bash-3.2$ vi

bash-3.2$ lS

hello.rb

bash-3.2$ vmc push

Would you like to deploy from the current directory? [Yn]: y

Application Name: hellovikasCloudFoundry

Application Deployed URL: 'hellovikasCloudFoundry.cloudfoundry.com'? y

Detected a Sinatra Application, is this correct? [Yn]: y

Memory Reservation [Default:128M] (64M, 128M, 256M, 512M, 1G or 2G)

Creating Application: OK

Would you like to bind any services to 'hellovikasCloudFoundry'? [yN]:

Uploading Application:

Checking for available resources: OK

Packing application: OK

Uploading (0K): OK

Push Status: OK

Staging Application: OK

Starting Application: OK


Friday, June 24, 2011

ESB in a Box

The title says it all: “ESB in a box”. Yes, that's right, and it's almost true for the IBM XI50 appliance. Recently I came to know the full feature set of this appliance and the other DataPower appliances from IBM, and it looks very interesting. I’m sure it will compete with the already established ESBs in the market, such as Mule, WSO2, Open ESB & JBoss ESB, to name a few. What makes it so compelling is that it can integrate with other IBM DataPower family members, like the XC10, to provide elastic caching with the ESB, which I think is a bonus and will get more and more attention from the SOA industry. Let's look at some of the feature set of the XI50:

Support for direct DB connectivity including IBM DB2.

Any-to-any message transformation, such as binary, text and XML messages

Support for enabling simple rules within your applications.

Support for data transformation with routing capability.

Message-level XML security and fine-grained access control.

Enables extreme reliability by securing services at the network layer with advanced XML/SOAP/WS-Web services processing and policy enforcement.

Bridges to Web 2.0 technologies with JSON filtering and validation, support for REST verbs, and converting/bridging of REST and Web services.

Integrates with IBM WebSphere Service Registry and Repository for governance capability, interoperability and connectivity.

Integrates with other DataPower family members to make life easy for developers.

Supports the JAX-WS feature pack for security, message reliability, etc.

I hope this gives you some idea of the IBM DataPower world...