2014 new year tech resolutions

it’s that time of the year again to contemplate, appreciate how far we have come and look forward with excitement and anticipation to what is ahead. our team is growing and so is the challenge. i am learning to appreciate the intricacies of managing a diverse tech team and providing a solution that is both granular and all encompassing at the same time. it may sound like an oxymoron, but i’m sure this statement holds ground for most HIT startups involved in care coordination and the BPCI program.

the truth is that when dealing with standards, data feeds should be synthesized and clean. yes, being compliant with MU1 should mean just that, but the fact of the matter is that it does not. it’s easy to point a finger at the hospitals, but that is just not fair. the same is true for browsers that implement web standards: if you are involved in web development you will have to optimize and tweak your code per offering. it’s just how it is.

this is where i hope that one day the state HIEs will step up to provide a more robust infrastructure that is both available and clean. the winner will be the one with critical mass that does a reasonably good job, where a good job means a decent and relevant snapshot of the patient. and then some. HIEs can and should be smarter than their users soon. that alone will justify their existence over plain aggregators. it will be interesting to see how shiny pans out.

so new year tech resolutions:

1. make the switch from SVN to GIT and implement gitflow. this will allow us to release more quickly.
2. BDD is not just a cool term. it’s how we will ensure quality and scale.
3. quicker, more agile releases. move as close as possible to continuous integration.
4. tailor a winning hybrid schema between document storage and relational. find the balance between the two.

why we switched from svn to git

the debate within the team has been going on for a while… should we stay with svn or should we move to torvalds’ git? should we stay centralized or is it time to decentralize our version control system?

on the personal level, git will allow each developer to move a bit faster with a full local history and the benefit of quickly switching between branches. as a team we saw the flexible workflows we can work with under git, and how we can really take advantage of branches as frequently as we check out an issue on jira. but at the top of the pile was the end of the merge hassle: almost every developer on the team has some fear associated with merging their code, and the longer they have been working on their branch – the bigger the elephant in the room grew.

our migration plan was simple:

1. create a read only copy of our repository
2. pull all the changes from subversion
3. migrate our toolset away from svn to git (CI, code reviews etc)
4. make the switch
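the steps above can be sketched with git-svn. a minimal sketch, assuming a standard trunk/branches/tags layout; the repository URLs here are hypothetical placeholders, not our actual servers:

```shell
# 1. create a read-only git mirror of the svn repository (hypothetical URL)
git svn clone --stdlayout https://svn.example.com/ourproject ourproject-git
cd ourproject-git

# 2. keep pulling changes from subversion while the team still commits there
git svn fetch
git svn rebase

# 3. point the tooling (CI, code reviews) at a new git remote
git remote add origin git@git.example.com:team/ourproject.git

# 4. make the switch: push everything and retire the svn repo
git push origin --all
git push origin --tags
```

once everyone pushes to the git remote only, the svn repository can be made read only for good.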

re-educating the development team is one of the harder tasks. ideally there are a few git champions within the team who can infect the rest with their passion for the switch. take into account that no matter what you do, some will resist, and it will take a few months before everyone is aligned and appreciative of the change.

the necessity of software design

when designing software many aspects come into play, and design plays an important role in the decisions made at the product level. there is a balance between effort and flexibility that is sometimes linear: quick and simple will hit a wall at some point and will most probably not scale, nor will it be easy to change key features and/or layout. these design decisions extend well beyond the choice of the right MVC backend and frontend framework, and they also cover the choice of data warehouse and whether it should be relational, pure noSQL or a hybrid.

as the product one builds starts having more users interact with it and data accumulates – requirements change and evolve constantly. staying agile in that respect makes sense, especially in the initial startup phase. in general, issues that may come up can be mitigated if the product is well designed. like everything else in the world of software, there is a delicate, zen-like balance between effort and efficiency.

dude where is my code?

so the users LOVE the right side bar and would like to add a couple more navigation items. awesome. hmm… in which file is that menu created? let me quickly consult the guy who coded it… oh wait, he is no longer with the company… you know what, i can easily start the debugger and move about the code until i see something that makes sense… common practice? probably. efficient? certainly not. the rule of thumb is that it should be obvious as daylight where changes should be made, and everyone should maintain these rules; otherwise things get so messy that the simplest improvements are a pain. good code organization and well documented methodologies that are taught to the team and well explained are an important step.

context and boundaries

so you have found the code segment where you think the changes should live. great. before you refactor it, answer this question: are you clear on what this code is supposed to do? the expected input and output? is it clear what the code is NOT supposed to do? each code segment should live on its own, unit testing and all, where it’s very clear what the method is aimed to achieve. no guessing.
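a minimal sketch of what that looks like in practice: a function with an explicit contract, including what it does NOT do, pinned down by unit tests. the function name and the MRN format here are made up for illustration:

```python
def normalize_mrn(mrn: str) -> str:
    """return a medical record number as a zero-padded 8-digit string.

    this function ONLY normalizes formatting; it does not validate that
    the mrn exists in any registry -- that is someone else's job.
    """
    digits = "".join(ch for ch in mrn if ch.isdigit())
    if not digits or len(digits) > 8:
        raise ValueError(f"not a valid mrn: {mrn!r}")
    return digits.zfill(8)

# the unit tests make the expected input and output explicit. no guessing.
assert normalize_mrn("123-45") == "00012345"
assert normalize_mrn("00000001") == "00000001"
```

whoever finds this code segment a year from now knows exactly what it is supposed to do, and the tests break loudly if a refactor changes the contract.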

software fragility

nir, i’ll be happy to take care of that ticket. just know that if i change that controller the other view will break and we’ll have to take care of that, and then there’s this other controller… unfortunately it’s common to see software break spectacularly with minor changes (especially on the frontend). this is why both unit and functional testing are critical, but that’s for another post.

development scalability

when one developer’s commit breaks one function, ideally it will not impact other developers who are working on other regions of the code. ideally code integrations will include fewer merge conflicts and will be resolved smoothly. this is one good reason to switch to git if you haven’t done so yet.

deployment

updating your product may be a very delicate process, especially if the model changes and you need to update the current schema while staying backward compatible with the current data sets of a live target. now consider that your product is white labeled to 50 customers, where different versions of the product are deployed to different servers, and you may need to roll back or quickly patch critical bugs. the complexity of maintaining these machines grows exponentially without a proper deployment strategy.
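one common way to keep 50 deployments at different versions manageable is ordered, versioned schema migrations: each deployment records the schema version it is at, and an upgrade replays only the statements it is missing. a toy sketch, with made-up table and column names:

```python
# ordered schema migrations: version number -> SQL statement.
# the table/column names below are hypothetical, for illustration only.
MIGRATIONS = {
    1: "ALTER TABLE patients ADD COLUMN middle_name TEXT",
    2: "CREATE INDEX idx_patients_mrn ON patients (mrn)",
    3: "ALTER TABLE encounters ADD COLUMN discharged_at TIMESTAMP",
}

def pending_migrations(current_version: int) -> list[str]:
    """return the statements needed to bring a deployment from
    current_version up to the latest version, in order."""
    return [sql for version, sql in sorted(MIGRATIONS.items())
            if version > current_version]
```

a deployment at version 1 replays migrations 2 and 3; a deployment already at 3 replays nothing. rollbacks would need a matching "down" statement per version, which this sketch omits.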

productivity

this is one important metric to wrap your head around so you are able to benchmark your team. as you build your system, what is the effort required to add a specific feature? is the investment linear in terms of time and capital? does good design help people be more productive?

complexity

as the system grows the requirements will change, because you get good feedback from your end users and what you thought you knew a year ago has now turned obsolete. supporting multiple and different code bases, and the requirements for higher availability, redundancy, better performance and backward compatibility, all come down the pipe as your platform and client base grow.

some final thoughts… when it comes to design we now know that some prior planning and careful thinking can go a long way.

the bottom line is that the product owner needs to stay focused on what the business requirements are. the product should only solve the problems it is designed to solve so the time dedicated to development is better spent.


scrum 101

scrum methodology is quite popular and i’d like to dedicate this post to making some sense out of it for those of you who are not familiar with it. if you run a development team, scrum could be an interesting fit for you or at the very minimum something new to consider and evaluate.

scrum is a software development methodology that works along the lines of lean manufacturing and agile. to make sure we all speak the same language: agile is lean is scrum from a superficial, high level view point. if you are wondering about the origin of the name, scrum is borrowed from rugby, where the players lock together and try to gain possession of the ball by hooking it back with their feet. small iterations. makes sense?

the methodology assists in defining the development life cycle and its stages, the key players and roles, and how responsibility is delegated. scrum also helps you stay on top of a project and its progress, address and perform changes/enhancements to the development plan, and deal with risks. taking a step forward, scrum can help lead development teams, draw conclusions and improve both product and process on a regular basis.

if you are scrum, the world is roughly divided into you and waterfall. “you” means agile methodologies such as kanban and XP. waterfall is a more sequential approach from the 70s that was (probably) influenced by traditional manufacturing, and is similar to the approach of building a house:

  • there are restrictions on the order of operations (one cannot lay the roof before foundation for example)
  • mistakes are freaking expensive so better get it right by carefully planning and quality measuring your work
  • much repetition (many doors, many windows) so consolidating tasks means efficiency

so with waterfall one first gathers all the requirements for a product, then architects the solution, then dives into a detailed technical design of each component, codes it up, integrates, tests tests tests, fixes and releases.

with agile, developing software is more like designing a department store:

  • usually there are loose restrictions on the order of the tasks at hand
  • a wide array of features
  • a detailed and strict planning may fail. small incremental steps and proper adjustments moving forward works better (scrum anyone?)
  • centralizing tasks helps to a certain extent

scrum in action:

  • with agile the team goes on sprints (2w minimum for us) where each sprint gets us one step closer to our goal
  • effectiveness is key: how many features (stories) were coded into the system. fewer lines of code in general, more code that does what the end user really needs
  • no elaborate MRD/PRD. with agile one maintains a backlog and direct communication. we hold daily standups, sprint planning, and we retrospectively learn from our mistakes and successes.
  • flat hierarchy across the team, and more responsibility is handed off to the developers. the team self-manages using a structured process
  • a team is composed of people with complementing skills, so each team can get stuff done on its own
so agile is more of a philosophy than a set of rules, right? not quite:
  • every activity is time-boxed, and activities are prioritized from the most important to the least. when time runs out we are hopefully better off than before we started, since the product work was done beforehand to make sure the most important features are at the top of the list (i.e. the backlog)
  • with agile the developers are encouraged to write only what is absolutely necessary, and most probably we will revisit this code later on for changes and enhancements. this is where a well thought through QA process is essential

think of scrum and agile as a framework for getting the job done. depending on the dynamics of your team and the size of your company, agile may be what you want to implement. i think it works very well for startups and for collaborating with small teams when outsourcing projects. at the end of the day, agile and waterfall are both ways to increase productivity and allow developers to make the best of their time.

good luck!

healthcare and big data

everywhere you turn people are talking about big data, hadoop and sharding. rightfully so. in today’s day and age managing a lot of data is not an easy task, as performance and scalability are key. traversing large data sets by dividing them into small sections and distributing the load among many machines (processors) is nothing new.

hadoop exists in order to solve specific problems and has emerged out of necessity. what hadoop does is provide the infrastructure to connect multiple (cheap) servers into a coherent environment with which high i/o and cpu problems (algorithms) are solved.

it all started with doug cutting’s text indexing library lucene and its web crawler spin-off nutch. inspired by google’s papers on GFS (2003) and mapReduce (2004), doug set out to achieve the same goals in a distributed environment. hadoop, BTW, is named after his son’s yellow elephant toy. in 2006 yahoo hired doug to improve the project so it could index the entire internet, and the project became open source. that marked the start of the revolution.

at its core hadoop includes two projects: one for distributed storage and one for distributed computing. around those two, a vast array of projects have evolved (and still are evolving).

HDFS: hadoop distributed file system
this file system is designed to store large files and enable large, efficient reads and writes. this is done by dividing each file into sizable chunks (blocks), where each chunk is normally replicated to 3 nodes, which can be anywhere. a “name node” maintains the mapping between a document and its constituent pieces and the data nodes they are stored on.
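the chunking and replication idea fits in a few lines. a toy, single-process sketch of what a name node tracks; the chunk size here is tiny for readability (HDFS blocks are tens of megabytes), and the node names are made up:

```python
CHUNK_SIZE = 4          # bytes; real HDFS blocks are 64/128 MB
REPLICAS = 3            # each chunk is stored on 3 data nodes
NODES = ["node-a", "node-b", "node-c", "node-d", "node-e"]

def place_chunks(data: bytes):
    """split data into chunks and record, name-node style,
    which data nodes hold each chunk."""
    mapping = {}
    for offset in range(0, len(data), CHUNK_SIZE):
        index = offset // CHUNK_SIZE
        chunk = data[offset:offset + CHUNK_SIZE]
        # round-robin placement; real HDFS is rack-aware
        holders = [NODES[(index + r) % len(NODES)] for r in range(REPLICAS)]
        mapping[index] = (chunk, holders)
    return mapping
```

losing any single node never loses a chunk, because two other copies exist; the name node’s mapping is what turns the scattered pieces back into a file.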

mapReduce:
an API for writing programs that run in parallel. the developer really only needs to write two simple functions, map and reduce, that each handle a single document (i.e. element of data); the framework takes care of scheduling, errors and failures (network, i/o, etc) across multiple machines. this allows for simple parallel batching, where a “job tracker” synchronizes the execution of the batch processes, and each batch is subdivided into smaller tasks handled by the “task trackers”.
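the classic illustration is word count: the developer writes only the map and reduce functions, and everything else (distribution, shuffling, retries) is the framework’s problem. a single-process sketch of the model, not actual hadoop code:

```python
from collections import defaultdict

def map_fn(document: str):
    """map: emit (word, 1) for every word in one document."""
    for word in document.split():
        yield word.lower(), 1

def reduce_fn(word: str, counts: list) -> tuple:
    """reduce: collapse all counts for one word into a total."""
    return word, sum(counts)

def run_job(documents):
    """single-process stand-in for the job tracker / task trackers."""
    grouped = defaultdict(list)
    for doc in documents:                  # the "map" tasks
        for key, value in map_fn(doc):
            grouped[key].append(value)     # the shuffle phase
    return dict(reduce_fn(k, v) for k, v in grouped.items())  # "reduce"
```

because map sees one document at a time and reduce sees one key at a time, both can run on many machines at once with no coordination beyond the shuffle.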

over time yahoo and facebook (to mention a few) wrote their own layers over HDFS and mapReduce and shared their work with the community. so hadoop is a code name for a set of technologies that harness the computing power of many machines to perform simple tasks in parallel. hadoop emerged from the world of unstructured data, where hundreds of millions of pages are analyzed. today big data is being implemented and researched in every facet of the economy, including healthcare.