Giving birth to software

There are a few exciting moments for a software developer who builds software (and loves doing it): the first “clean” run, 1M customers using it without a problem, … For me the most exciting moment is the “birth”. At first there is nothing, nothing material – just some thoughts in people’s heads. And then you create a folder, name it, add some files and run it. And it runs. It does nothing really, but it runs! There was nothing, and now there is something. This “something” is small, useless and ugly, but you can see the future beauty of it. You can see the people who will use it and love it (or dislike it). It will solve their problems and make their lives easier.

But at this moment it’s still just yours. And then you commit. And boom – it’s part of the world, not just yours anymore. Even if only your partner or the code repository knows about it. It’s out “there”.

And although I do this every day for various “exploration” projects, that’s different – it doesn’t bring the same excitement. Starting an application that will “live” is like planting a tree or breaking ground for a new home. I may not see the tree grow or build most of the home, but it’s all about starting it. Yes, I love to see a working product in the hands of people. It feels good. But starting it …

Quite often I see the auto-generated comment “Created by …” in code. I’ve never liked it. But when you commit the code, you have to give your name. And I know that there are a few products around that have my name at the root. And that brings a happy smile to my face 🙂.

A few days ago a new tree was planted …

Better TODO comment

Here is a simple trick that makes TODO comments more effective. You have seen TODO comments like this all over the code:
//TODO: convert to prepared statement.
var query = 'SELECT processedDateTime FROM ProcessedEvents WHERE hash = ?'
 
Yes, it’s good to have it. This is a way to track technical debt. And here is how it can be made much more useful:
 
//TODO: vvs p2 convert to prepared statement.
var query = 'SELECT processedDateTime FROM ProcessedEvents WHERE hash = ?'
 
Just 2 small pieces are added:
vvs – your initials,
p2 – priority (p1, p2, p3)
 
Now you can search the whole code base for your own TODOs (or TODOs created by a specific person). And when you see a comment like this, there is no need to go to Annotate or Blame to find out who left it (saving time).
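
For example, from the command line (the src/ path here is just an illustration):

grep -rn "TODO: vvs" src/      # all of my TODOs
grep -rn "TODO: vvs p1" src/   # only my highest-priority TODOs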
 
When you leave a TODO comment, you are in the best position to analyze the impact of the incomplete work. That’s why assigning a priority at that moment is so crucial.
 
It’s a simple trick that can save you tons of time, especially when you manage a team and do daily reviews of all commits.

Easy mocking in Node.js

I’m a big believer in unit testing. Not only because it allows me easy code refactoring and faster regression testing, but because unit tests let me deliver a product much faster. Yes, writing more code (the unit tests) shortens product delivery time. But that’s another topic.
 
Efficient unit testing is not possible without mocking. In Node.js I was a happy user of rewire. I liked it. Until today. For whatever reason it stopped working for me. Either it doesn’t like me anymore or something is wrong with me, but it refuses to rewire modules.
 
So, I spent some time looking for another mocking framework. There are a number of good ones around, but none of them felt 100% right, especially compared with the frameworks for C# or Java that I’m used to.
 
So, can mocking be done easily and nicely without a framework? Sure – the same way I have never used any IoC/DI frameworks with C# and always used simple manual injection.
 
And it’s actually very easy with Node.js.
 
In the following code, I need to test the create function. There are 2 calls that need to be mocked for unit tests: inboxDb.create and inboxProcessor.processEvent. Here is manual mocking with just a few lines of code – check the mock setters at the bottom of the code.
 
'use strict';
var moduleName = 'inbox.controller'

var Promise = require('bluebird'),
    appRoot = require('app-root-path')

var inboxDb = require(appRoot + '/server/components/inboxProcessor/inbox.db'),
    MessageModel = require(appRoot + '/server/api/message/message.model'),
    inboxProcessor = require(appRoot + '/server/components/inboxProcessor/inboxProcessor')

/**
 * Processes incoming message.
 * @param req
 * @param res
 */
exports.create = function (req, res) {
  var payload = req.body.m
  var ipAddress = req.connection.remoteAddress;
  console.log('%s|create|payload:%s', moduleName, payload)

  var message = {}
  inboxDb.create(payload, ipAddress)
    .then(function (result) {
      try {
        message = MessageModel.parsePayload(payload, ipAddress)
        return Promise.resolve(message)
      } catch (error) {
        console.log('%s|create|parsePayload|Error: ', moduleName, error)
        return res.send(204, 'poison')
      }
    })
    .then(function (event) {
      return inboxProcessor.processEvent(event, payload, undefined)
    })
    .then(function (result) {
      return res.json(201, result)
    })
    .catch(function (error) {
      console.log('%s|create|Error: ', moduleName, error)
      return res.send(500, error)
    })
}

/* Mock setters – replace the real dependencies in unit tests */
exports.setInboxDb = function (mock) {
  inboxDb = mock
}
exports.setInboxProcessor = function (mock) {
  inboxProcessor = mock
}
 
Now, with these setters in place, mocking is actually very easy in the unit test:
'use strict';

var Promise = require('bluebird'),
    should = require('should'),
    request = require('supertest'),
    appRoot = require('app-root-path')

var app = require(appRoot + '/server/app'),
    target = require('./inbox.controller')

describe('POST /api/inbox', function () {

  it('should process message', function (done) {
    target.setInboxDb(setDbMock('ok', false))
    target.setInboxProcessor(setInboxProcessorMock('ok', false))

    var data = { "m": "message - actual content is not important" }
    request(app)
      .post('/api/inbox')
      .send(data)
      .expect(201)
      .expect('Content-Type', /json/)
      .end(function (err, res) {
        if (err) return done(err)
        res.body.should.equal('ok')
        done()
      })
  })

})

function setDbMock(result, throwError) {
  var mock = {
    create: function (payload, ipAddress) {
      if (throwError) {
        var error = { message: 'db error' }
        return Promise.reject(error)
      }

      return Promise.resolve(result)
    }
  }
  return mock
}

function setInboxProcessorMock(result, throwError) {
  var mock = {
    processEvent: function (event, payload, next) {
      if (throwError) {
        var error = { message: 'error' }
        return Promise.reject(error)
      }

      return Promise.resolve(result)
    }
  }
  return mock
}
 
By replacing the actual invocations with our mocked functions, the code is easier to test and debug (in this case I’ve just created wrappers that can either return a result or reject with an error):
target.setInboxDb(setDbMock('ok', false))
target.setInboxProcessor(setInboxProcessorMock('ok', false))
 
Yes, it’s minimalistic and doesn’t have all the bells and whistles of other frameworks. But it’s simple and will cover 80% of mocking needs.
 
Update. I opened another set of unit tests with rewire today and it worked flawlessly. I’m not sure why the first set didn’t work, but I’m glad that I found this simple solution and will stick with it for now.
 

Data analysis with Hadoop and Hive. New tools. Old game.

I’ve done a lot of data analysis in my life, because for more than half of my career I was part of an actual business rather than an ISV. I’ve worked for a commodity exchange broker, wholesale and retail companies, Coca-Cola, Bayer and other businesses where I did a lot of analytical work on top of software development. I did my first data analysis project in the late 80s with Lotus 1-2-3, when circumstances forced me to pause software development and sell jeans (BTW, that turned out to be a great experience – I learned how to make a sale). And since then I’ve learned the power of databases and spreadsheets, which allowed me to dig the data and help businesses grow.
 
Recently I did another data analysis project with new tools – Hadoop and Hive. I’ve been working with them for the last 2+ years, but more from a development perspective. This was the first time I did actual data digging with them. And just an hour into the analysis I understood that although these are new tools, the actual process is the same. Yes, they provide some structural (sets, arrays, maps) and scalability advantages, but one needs the same data-analytical state of mind to get answers from the data. And the fact that you have this power doesn’t mean that you have to use it right away.
 

Getting answers from data is an iterative process – you write a query, analyze the result, optimize your queries and repeat. And that’s where you want to move fast – before you write The Query that will give you the ultimate answer. This speed in finding the path is especially important with large data sets and Hadoop latency – start small and move fast until you are really ready to “Release the Kraken”.
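
In Hive, that iteration might look something like this (the events table and its columns here are hypothetical):

-- Step 1: eyeball a handful of raw rows before writing anything clever.
SELECT * FROM events LIMIT 100;

-- Step 2: refine the logic on a single partition.
SELECT event_type, COUNT(*) AS cnt
FROM events
WHERE dt = '2014-06-01'
GROUP BY event_type;

-- Step 3: only when the query is right, run it over the full data set.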

 
So, I think that Big Data analysis doesn’t have a lot to do with Hadoop & Co – well, unless you are on the operations side. It’s still about how to dig the data and find a way to answer the question fast, before forming it into a repeatable process suitable for the tools.
 
And that’s why learning how to use spreadsheets to analyze data is so important – it teaches you how to find the answers.

Software Structural Quality, or Lessons from Open Source Projects

Software quality is important. Very important. Not a lot of people will argue with that. That’s why we have “tools” like QA and TDD – to ensure that we deliver high-quality software. But most of these “tools” address the functional quality of the software. What about structural quality? Isn’t it important as well? How often will you see a bug like this: “Method xxxxx is difficult to read and is too complex”? Maybe structural quality is not that important and is primarily a topic of software craftsmanship?
 
“Software structural quality refers to how it meets non-functional requirements that support the delivery of the functional requirements, such as robustness or maintainability – the degree to which the software was produced correctly.” If software has 0 bugs (what a noble goal) but is neither easily readable nor maintainable, is it good? “Hey, man, it works.” Yes, no doubt about that. But how does that help when new features should be added or existing code should be changed? Unless code has high structural quality, it cannot be easily changed, and it will take more and more time to deliver new features. Eventually it will also affect functional quality, because changing something, even something very small, in very bad code can lead to unpredictable behavior (read: bugs).
 
So, if a company would like to stay in business and release new versions of its software, then sooner or later it should take care of the “internal” quality. The sooner the better, before it’s too late. How many times have you heard something like this: “I’m not touching that code – nobody knows what will happen”?
 
Then why do so many companies produce software that works but is ugly inside? One of my ex-colleagues once said: “If our customers could see our code, they would never buy our application”.
 
“Hey, it works. Don’t you get it?” Yeah, I get it.
I’ve been in many companies, but I have not often seen high structural quality in code. That’s why it’s even more surprising to look at the quality of popular open source projects. Most of them are very good – easy to read and understand, nicely structured and covered with multiple unit tests. Shouldn’t it be the opposite – at my job I’ll write high-quality code, but I can go easy on myself with these for-fun projects? Funny, but it doesn’t work this way, because it’s impossible to add crap to an open source project without it being seen and accepted by everybody. But it’s absolutely possible at work – as long as I have 0 bugs, I can put in whatever crap I want; nobody will see it anyway. “Don’t you have a code review process?” Yeah, right – if I contribute crap, why would I reject your bad code?
 
So, what can open source projects teach us about the structural quality of software?
 
First of all, there should be somebody who cares. Who cares about good code. Who will be a pioneer of “let’s be proud of what we write”.
 
Second, create a culture of good code. Yes, it’s possible to set up a process, but unless the team believes in good code, there is always a way to cheat the process.
 
Have you ever tried to open up a PC for an upgrade? Yeah. You know what I’m talking about. Have you tried opening a Mac? The Mac Pro is beautiful inside. That is the way Steve Jobs wanted it – the product should be beautiful even on the inside.
 
And BTW, writing good code is much more fun. So, have fun 🙂 and be proud of what you write.
 
Update: Chris Hume volunteered to edit this post and make it better. Thanks, Chris.
 

What I like about C#

Recently I was asked which features I like most in C#. I’ve been working with it for almost 14 years now (since the end of 2000), and today I use it almost daily, along with Java and JavaScript (in Node.js). It’s been so much fun to witness the growth of the language – from birth through its teenage years to maturity.

C# is a very nice language, and I find the following features most useful:

– Generics bring the concept of type parameters. This makes code very clean, especially when dealing with collections.
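
For example, a List<int> knows its element type, so there are no casts and type errors are caught at compile time:

using System;
using System.Collections.Generic;

class GenericsDemo
{
    static void Main()
    {
        // List<T> knows its element type: no casts, no boxing of value types.
        var ids = new List<int> { 1, 2, 3 };
        int first = ids[0];   // no cast needed
        // ids.Add("four");   // would not compile
        Console.WriteLine(first);
    }
}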

– Linq adds a standard, easy-to-learn pattern for querying and updating data. It’s like SQL for C# collections. Linq lets developers express what should be done instead of how (forget about for and while loops). I especially like to use it in the form of lambda expressions. Although the .NET framework has a functional language – F# – the ability to do functional programming in C# is great.
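
Something like this (the Customer class here is just an illustration):

using System;
using System.Collections.Generic;
using System.Linq;

class Customer
{
    public string Name;
    public decimal TotalSpent;
}

class LinqDemo
{
    static void Main()
    {
        var customers = new List<Customer>
        {
            new Customer { Name = "Ann", TotalSpent = 1500m },
            new Customer { Name = "Bob", TotalSpent = 200m }
        };

        // "What", not "how": no for/while loop in sight.
        var topSpenders = customers
            .Where(c => c.TotalSpent > 1000m)
            .OrderByDescending(c => c.TotalSpent)
            .Select(c => c.Name)
            .ToList();

        Console.WriteLine(string.Join(", ", topSpenders));
    }
}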

– Parallel processing with Tasks. If you’ve ever done threading in C# (or any other language, for that matter), the simplicity that Task brings is a lifesaver. There is a reason why Microsoft recommends using Tasks instead of Thread/ThreadPool directly: not only does the code work better, it can also be written much faster and cleaner.
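
A minimal sketch (ExpensiveComputation is a stand-in for real work):

using System;
using System.Threading.Tasks;

class TaskDemo
{
    static void Main()
    {
        // Start two independent pieces of work; no Thread/ThreadPool plumbing.
        Task<int> first = Task.Run(() => ExpensiveComputation(1));
        Task<int> second = Task.Run(() => ExpensiveComputation(2));

        // Block until both complete, then combine the results.
        Task.WaitAll(first, second);
        Console.WriteLine(first.Result + second.Result);
    }

    static int ExpensiveComputation(int seed)
    {
        return seed * 42; // pretend this is expensive
    }
}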

– async / await. Built on top of Tasks, these keywords let you write asynchronous code that reads like synchronous code; the compiler generates the continuation plumbing for you.
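
For example (the URL here is illustrative):

using System;
using System.Net.Http;
using System.Threading.Tasks;

class AsyncDemo
{
    static void Main()
    {
        Console.WriteLine(GetPageLengthAsync("http://example.com").Result);
    }

    // Reads top to bottom like synchronous code; the compiler
    // generates the state machine behind the scenes.
    static async Task<int> GetPageLengthAsync(string url)
    {
        using (var client = new HttpClient())
        {
            string body = await client.GetStringAsync(url);
            return body.Length;
        }
    }
}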

– dynamic. There is a reason why dynamically typed languages (Ruby, Python, JavaScript) are so popular. If you have never experienced it before, it feels awkward at first, but then you get it and have a blast. I have been working with dynamic languages since the early 90s, and having this functionality as part of C# makes life much easier in some cases.
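
A tiny illustration:

using System;

class DynamicDemo
{
    static void Main()
    {
        // 'dynamic' defers member resolution to run time,
        // much like JavaScript or Python does.
        dynamic value = "hello";
        Console.WriteLine(value.Length); // resolved at run time: 5

        value = 42;
        Console.WriteLine(value + 1);    // now an int: 43
    }
}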

With these features having been around for a while, it sometimes amuses me how often I still see engineers not using them.

So you want to be a software architect. Part 3

In the second post I shared my thoughts on the most important skill an architect needs if he wants to see his design materialize. Today we will go over the ability to identify risk and the commitment to priorities.

Identify risk

As you design the application or subsystem, it’s important to see which parts are risky. Before you hand the design over to the development team, you must be sure that the risks and unknowns have been investigated and resolved.

Let’s take a look at an example. In order to speed up the development of new versions of the product, you need to transition from a monolithic architecture to a Service-Oriented Architecture (SOA). You know how to split your business functionality into multiple independent services. You have addressed the issue of internal routing and load balancing. But what method should the services use to communicate with each other? There are many approaches. SOAP would be an obvious choice, but it would require your team to learn the new skill of building SOAP-based services, and the operations team to learn how to deploy and maintain them. Maybe that’s not a big deal, of course. So instead you consider building the internal SOA on AJAX/JSON services, because your team is already familiar with this approach from building web applications, and operations knows how to deploy and run them. But what about performance? It can be slower than the SOAP approach. Unless you measure and compare the performance of both approaches, and then agree with the team that the performance “penalties” are acceptable given the advantage of reusing existing skills, you put the transition to the new architecture – and delivering it on time and with high quality – at risk. In this case you should measure first and build second, not build first and then test and measure:

  1. Identify the risk
  2. Investigate it
  3. Find alternatives
  4. Agree with the team about the approach
  5. Do it

Risk may present itself in many areas – team skills, the complexity or stability of the 3rd-party components you plan to use, the performance of the algorithms, the scalability path and so on. By “realizing” the new architecture in your head, identifying the risks as part of the design and addressing them in the early stages, you will be able to deliver a design that the team can successfully implement.

Commitment to priorities

Each new project or release has priorities. Usually they are:

  • time
  • quality
  • scope
  • resources

Priorities are defined by management and driven by business needs. And if they are not defined, then you should ask for them. Very often the resources are fixed – your team has the same number of people. Of the other three, you can balance only two and have to sacrifice one. For example: with limited time, it’s not possible to deliver all the features with high quality. And it’s easy to lose sight of priority #1 – let’s say it’s time – by creating a perfect design or designing an interesting feature (both take time). You have to deliver according to these priorities, even if you sometimes disagree with them.

(To be continued)