Showing posts with label Integration. Show all posts

Saturday, March 28, 2020

Work, and Various Other Musings.

I want to talk about work for a bit, though I'll probably post something about the coronavirus soon after. I'll try to keep this from getting too technical, so bear with me. :)

My job title is "Technology Integration Engineer". I don't recall if I wrote about that before, so here's a very simplified version of what I do -

Every time you have an app that updates (Windows updates, smartphone apps that update, whatever) it generally involves many, many hours of work behind the scenes. You have programmers that code up some changes, testers to test them, etc.

Because some things change on a regular basis, there are also configuration files for just about everything. That way if you are using a specific url (as one example), when it changes you don't have to change the code, you can just update the configuration files as needed.
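To make that a little more concrete, here's a minimal Python sketch of the idea. Everything specific in it is made up for illustration: the `[billing]` section, the `service_url` and `timeout_seconds` keys, and the example.com address. In real life the settings would sit in their own file rather than in a string, but the point is the same: the code asks the configuration for the URL instead of having it typed into the program.

```python
import configparser

# Hypothetical settings. Normally these would live in a separate file
# (something like app.ini) that can be edited without touching the code.
CONFIG_TEXT = """
[billing]
service_url = https://billing.example.com/api
timeout_seconds = 30
"""


def load_settings(text):
    """Parse INI-style text and return the billing settings as a dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["billing"]
    return {
        "service_url": section["service_url"],
        "timeout_seconds": section.getint("timeout_seconds"),
    }


settings = load_settings(CONFIG_TEXT)
print(settings["service_url"])
```

If the URL changes, only the configuration text changes; `load_settings` and whatever calls it stay exactly the same.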

Since applications run on multiple machines (generally a machine for the database, which stores all your data. A machine for serving up the webpage, and various machines for other tasks - like verifying you when you log in, or validating your credit card when you pay a bill, and so on and so forth) there's also connectivity settings, messaging, and a lot of other things. It's complicated and messy -

And I actually kind of like it. I do still intend to get into computer security (infosec, cyber security, whatever you want to call it), but it's kind of fun to troubleshoot why the heck something isn't working. I'm getting reasonably good at reading logs and trying to figure out what they mean.

Now, tech is a fast-paced industry, and it seems to me that tech companies that have been around for a while are always in the midst of some sort of transition. There's older, legacy systems - and since they work, and most of the bugs have been sorted out, they don't generally want to replace them unless they have to - and then there's newer technology, and it all makes a rather confusing mess. One of the big changes going on is less about the tech (though that plays a role), and more about how the company is organized.

See, in the older style of software development, you might put out new code once a month, or once a quarter. The developers develop, the testers test, and my job is to help integrate the code with all those various configuration settings so that the environment works. Sometimes that means fixing issues when a new build (i.e. the code changes for whatever version they're working on) gets pushed, and sometimes it means trying to help troubleshoot some defect that the testers have identified.

But once a month is... slow. Faster, more agile companies can push out changes daily... or even faster. Which is what one of the current industry buzzwords is - agile.

As someone who graduated a little over a year ago, I had heard the term but didn't initially have a clue what it really meant - but last Feb I was in Dallas for some corporate training, and we discussed it in more detail.

The biggest takeaway, for me at least, is that in order to be more agile they essentially want to restructure how we do things. Instead of having a dev team, and an integration team (mine), with database administrators and the like, and testers... they essentially put those skills all on a small team dedicated to one particular application. That way they can code it up, run it, test it, and push it out in a timely fashion.

Which made me wonder what was going to happen in my own company, because it means my current job would go away. (Maybe. Supposedly the transition can take five years or so).

Now, I want to circle back to something that happened early in my career here. Before I learned enough to actually be busy fixing things, I got sent to some training that discussed the concepts laid out by Google for SRE. I then got tagged to be on a scrum team (which is what they call those small teams in the agile framework, iirc), but tbh I hadn't really been able to do much with it. Most of it was very different from what I do in integration, and the project I found most relevant to my day job was when we tried looking at file system cleanup.

But now that I've learned enough to be more useful, I don't have to spend most of my time dealing with the day-to-day integration issues. (We each take turns being 'on call' as the primary point of contact for the day's issues, and as a trainee I'd been handling more and more of the non-production issues. Now, though, I should be rotating when I'm on call... so about one week a month I should be busy with that, and the other three weeks I can start working on other things.)

Last week was the first time I had the chance to really dig into some of that, and it sort of amuses me. Because even though I was hired as an integration engineer, and even though I learned java for school, it turns out that I'm apparently also a bit of a python developer now.

Sort of. Still spend most of my time in integration, ofc. It kind of amuses me, as someone who learned an entirely different language. Like, yes... you too can wind up programming in an entirely different language, even if you weren't hired as a programmer in the first place.

The last few months have been interesting. I like learning new things, but I have had to learn soooooo much! It's right at the edge of my comfort zone, tbh. And there's all sorts of things that I feel I only barely understand. (The number of times I've been asked to help fix something, and have no clue what the acronym they're using means, or what it really does for our application... well. I can see why imposter syndrome is such a thing in our industry. I now have a little bit of an understanding of containers, and microservices, and mq, and a bunch of other enterprise level technology... and a bit of sql and Oracle, plus some Couchbase and nosql. Only superficial knowledge, I'd say... barely scratched the surface on a lot of it, but def something that keeps me on my toes.)

There's other parts to the job, non-tech related. Some days we're slammed with issues, all of which they claim are 'Severe 1, high priority, must be fixed right away.' Which, I mean, yeah... we'll all do the best we can. But the true top priority is production, since there are real live people who would get pretty darn mad if things break. If they couldn't pay their cell phone bill, or get a new phone, or whatever.

And the training environments are also pretty important, since our customers have people who are trying to learn the system... and they can't if that system isn't working.

Which is why, as someone still relatively new, my focus has been on non-production. If I screw something up there, it won't have as big of an impact.

But the customers often act as though these testing environments are as important as production (Severe 1, high priority. Fix it now!), and there's only so many of us to do the fixing. Which means I also have to learn how to multi-task, and prioritize, and try to keep the customers happy while also keeping some semblance of work-life balance.

Supposedly, I think as part of that work-life balance our bosses and clients got together and agreed that we wouldn't support outside of working hours, but the clients don't often act like that... and want us to work all hours of the day. (I had been a bit miffed when one of them called on Christmas Eve - Christmas Eve!!! - for one of those issues. It wasn't production, it wasn't the training environment, and what the heck are they doing testing on Christmas Eve!)

I feel a bit guilty saying that. Like, there is so much pressure, maybe mostly self-imposed?, to try and get it done, fix things, and keep the customers happy. But... nobody is getting shot at. It's not production. Especially now, with the coronavirus and everything else going on, I wish I could tell them that there are more important things they ought to be doing right now.

Like spending time with family.

Whatever. I don't really think I'm wrong, I think I feel guilty for two reasons. One, because I can do it, and it prob wouldn't take all that much time, but I'm putting my foot down because you have to draw the line *somewhere*. It becomes too easy to say, sure... let me knock that out. And then you wind up staying an hour or so after work, and dealing with non-production issues on Saturday and Sunday, and skipping lunch, and no. Not unless it's a true priority.

Secondly, because it feels like most businesses (no matter what they say) don't like people acting as though there's more to life than work, so I wonder whether or not my company has my back on this.

Except they mostly do? Sort of? I'm on call again this week, and just got a call about an hour or so ago for another one of these things, and basically said "we're not supporting you right now, feel free to escalate to my boss"... and I haven't got a call yet, or been told I'm in the wrong.

It's Saturday. What are the consequences of not testing, right now? A slight delay in releasing to production? Is that really that big of a problem?

Smh. (This touches on something I may or may not get into as I talk about coronavirus, since there's been this whole question about the impact of social distancing on the economy, but I'll get to that if or when I get to it.) 

Saturday, August 3, 2019

Job Update - Part II, Business Applications

Earlier I wrote a series of articles discussing what happens when we connect to a website, and I used an analogy of the post office to describe how messages get routed. I talked a little bit about what goes on at the business side of things, and now I want to go into much greater detail.

Let's say you want to shop online, or transfer funds, or any of the zillion things we now do over the internet.

You open a browser on your phone, tablet, laptop or desktop and connect to a URL. In my previous series of posts I described how this gets translated into a series of messages that get routed to the 'front office' of a business, which then sends your information on to their fulfillment center or distribution center for processing.

There's a bit more to it than that. You see, the business will have one machine (or building) that responds with the webpage you requested, but in order to fulfill your request it needs to know a few things. Like your login info, and whether you're authenticated as the person authorized to view your account info. Then it needs to find your particular information (out of all the other people who have accounts there) and let you see yours, and yours alone. Plus there has to be a method for adding new customers, or removing old ones, and getting your billing information, and more. 

So one machine may be dedicated to offering up the requested web page, and another machine may handle authenticating login information, and still another machine may hold the database with all your order history or transaction history, and still another may be secured more tightly because it holds everyone's billing information, and so on and so forth.

But wait, there's more!

If the business is reasonably large, it may have thousands or millions of people interacting with their websites on any given day. So they need a way of handling all that traffic. PLUS, people get pretty upset when a service isn't available. If they want to order something, or pay a bill, or whatever, they want to do it Right. Now. And they aren't going to be pleased if your website crashes.

So businesses need redundancy, because you as an individual may survive if your hard drive crashes, but a business might not. So there are ways of having two or three machines acting like one web server, so that if one fails the other two can pick up the slack. And there are things called 'load balancers', that help make sure that all that traffic gets routed to the servers in an even manner. Otherwise, one server might get so overwhelmed that it responds to requests more slowly, while another is sitting idly by.
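Here's a toy Python sketch of the simplest way to spread traffic evenly, usually called 'round robin': each new request just goes to the next server in the list. The class name and the server names (`web-1` and so on) are invented for the example, and real load balancers do a lot more than this (checking whether a server is still alive, for one), so treat it as an illustration rather than how any particular product works.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation so each gets an equal share."""

    def __init__(self, servers):
        # cycle() repeats the list forever: web-1, web-2, web-3, web-1, ...
        self._rotation = cycle(servers)

    def pick(self):
        """Return the server that should handle the next request."""
        return next(self._rotation)


# Hypothetical server names, purely for illustration.
balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])

# Pretend nine requests arrive and record which server each one went to.
assignments = [balancer.pick() for _ in range(9)]
counts = Counter(assignments)
print(counts)
```

With nine requests and three servers, each server ends up handling three of them, so nobody is overwhelmed while another sits idle.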

But if you have two or three machines doing the same thing, you also have to make sure they're synchronized and share the same data sources. So instead of having a hard drive on one machine, you'll probably store everything on a shared Storage Area Network (SAN), which will also have built in redundancy so that if one of the drives fails the data can still be recovered.

Oh, and you also have to worry about making sure that transactions happen once per request, and only once. That is, if something crashes while a request is being made you have to make sure that the request finishes processing (or doesn't process at all, undoing anything done before the failure). That way you don't get charged twice for the same order or something.
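The "doesn't process at all" half of that is what databases call a transaction, and a small Python sketch with the built-in `sqlite3` module shows the idea. The table names, the customer, and the `fail_midway` switch are all made up so the failure can be simulated; this only illustrates the all-or-nothing part, not everything a real payment system does to avoid double charges.

```python
import sqlite3


def place_order(conn, customer, amount, fail_midway=False):
    """Charge the customer and record the order as one all-or-nothing unit."""
    try:
        # Using the connection as a context manager commits if the block
        # finishes, and rolls everything back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, customer),
            )
            if fail_midway:
                # Hypothetical stand-in for a crash between the two steps.
                raise RuntimeError("simulated crash")
            conn.execute(
                "INSERT INTO orders (customer, amount) VALUES (?, ?)",
                (customer, amount),
            )
        return True
    except RuntimeError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100)")
conn.commit()

balance_query = "SELECT balance FROM accounts WHERE name = 'alice'"

failed = place_order(conn, "alice", 30, fail_midway=True)
balance_after_failure = conn.execute(balance_query).fetchone()[0]

succeeded = place_order(conn, "alice", 30)
balance_after_success = conn.execute(balance_query).fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

print(balance_after_failure, balance_after_success, order_count)
```

The first attempt fails partway through, so the charge is undone and the balance is untouched. The second attempt completes, so the charge and the order record both land together.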

All of which sounds like a lot, when you think about it. But businesses are aware of all this and most of them have figured out how to make it happen (even if that sometimes involves outsourcing services, like using a cloud provider to manage your machines.)

Anyways. Three years ago if you had asked me what an application was, I'd probably have said it was something like Word for Windows, or Pokemon Go. You download some sort of file (most likely an .exe file, since it's an executable), install it, and it does stuff.

And learning to code meant I was much more aware of how complicated creating that .exe was. I mean, the Windows operating system has something like 50 million lines of code. Trying to figure out how all that fits together would be insane.

I knew that there had to be a way to allow multiple people to work together on a program that large, and I'd heard about things like GitHub, and understood the importance of version control. After all, I had the joy of trying to figure out why a change in one part of my program broke something in another part, and that was all just me. Trying to manage the efforts of five or ten different people all working on different parts of the program at the same time? That requires a good supporting structure in terms of tools (like GitHub), division of labor (who is responsible for which parts of the program), and procedures for deciding when something gets accepted into the official program.

Anyways, I've come to realize that at the business side of things 'application' refers to much more than just the lines of code that get compiled into an executable. Especially since more and more businesses offer up their applications on websites, which has several advantages. (i.e. the user just has to remember the website. The business can update the application as desired, and the client doesn't have to download and install any of the updates... they'll see the changes when they go to that URL. My company apparently used to offer a .exe program that our customers downloaded and installed, but I believe we stopped doing that and now offer it as a web application.)

Which means that, from the business side at least, an 'application' refers to more than just the code that goes into it. It also refers to the various machines required in order to make the website work, to include the databases, the processing of requests, and more.

And we're still not quite at what I'm doing.

See, businesses need a process for developing and maintaining their application. Something like the Software Development Process. There's apparently a lot of different ways of doing this, and you can follow the link to explore more on that. Most have some variation on the basics - i.e. figuring out what the requirements are, coming up with the code to do it, then testing, testing, and more testing before finally releasing it into production. And really, that last stage might be considered the final test, as anyone who's had to deal with bugs after an update can attest.

Each of those phases needs its own version of the application. They might not get the heavy traffic that the official application does, so they may not need the load balancers and multiple servers, but they still need a web server, database, machine for authenticating users, etc. In other words, you need to duplicate the entire environment.

Not only that, but occasionally issues come up with the current (live) version. The one in production. Maybe a vulnerability was discovered, whether in the business's program or a third party's software used by the business, and a patch needs to be applied. Maybe an issue came up after the latest version was released into production. Plus if you've divided up the labor, you may need one environment focusing on a particular part of the application (like billing, or the website), and another environment focusing on something else. Whatever the reason, you need to have multiple environments for every stage of development.

And this is where I come in. My official job title is "Technical Integration Engineer". We've got, I dunno... maybe 40+ environments involved. Each with at least four or five different applications. I say I don't know because some of them have been retired or aren't currently in use, so I can't really say how many there are altogether.

Each of them has to deal with reboots and what we call a 'build push'. Reboots because (as you may have experienced) rebooting can clear out old data that would otherwise cause errors and bugs, so it's good to regularly reboot your machines. And a build push? Well, if you've made some changes and want to test them, you have to incorporate them into your software build and then push them into the environment for testing. Then you can try doing all the actions a user would and see if it works or not.

I'm still very much at the beginner stage of my job, so right now most of it is about dealing with any sorts of issues with rebooting or pushing a build. It also means monitoring how much disk space we're using and clearing out some of the older and more obsolete files if we start running out of space.
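As a rough idea of what that kind of check can look like, here's a small Python sketch. It assumes the 'space' in question is disk space, and the function names, the 30-day cutoff, and the demo files are all invented for the example (it isn't one of our actual scripts). It only lists candidates for cleanup; it never deletes anything except its own demo folder.

```python
import os
import shutil
import tempfile
import time

SECONDS_PER_DAY = 86400


def percent_used(path):
    """Return how full the disk holding `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100 * usage.used / usage.total


def files_older_than(directory, days):
    """List files under `directory` last modified more than `days` ago."""
    cutoff = time.time() - days * SECONDS_PER_DAY
    old_files = []
    for root, _dirs, names in os.walk(directory):
        for name in names:
            full_path = os.path.join(root, name)
            if os.path.getmtime(full_path) < cutoff:
                old_files.append(full_path)
    return sorted(old_files)


# Demo with throwaway files: one fresh, one backdated by 40 days.
demo_dir = tempfile.mkdtemp()
old_file = os.path.join(demo_dir, "old.log")
new_file = os.path.join(demo_dir, "new.log")
for path in (old_file, new_file):
    with open(path, "w") as handle:
        handle.write("example\n")

forty_days_ago = time.time() - 40 * SECONDS_PER_DAY
os.utime(old_file, (forty_days_ago, forty_days_ago))

candidates = files_older_than(demo_dir, days=30)
print(f"{percent_used(demo_dir):.1f}% of the disk is in use")
print(candidates)

shutil.rmtree(demo_dir)  # tidy up the demo folder
```

Only the backdated file shows up as a cleanup candidate; the fresh one is left alone.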

It often means working at the 'back end', that is... if someone in one of these environments is having a problem accessing a web page or performing an action, I'm checking logs or running scripts in the shell environment. Luckily, my predecessors have already created a bunch of scripts for our most common tasks. Mostly I'm just learning my way around, learning where to go to find the logs or scripts for which application in which environment, and what to do when one of the gajillion alerts comes up. (I learned about something called 'alert fatigue', and I think my organization really suffers from it. I've also spent a bit of time coming up with a system for my Outlook e-mail that I think is satisfactory. I already knew about creating rules to sort e-mail, but we get waaaaaayyyy too many of those for me to rely on. So I simplified it down and created some rules dividing things by environment, and sending all the automated e-mails to a couple of folders. Then I created some search folders so I can easily find any of those specific alerts or reports. Outlook was annoyingly unhelpful at doing some of the things I wanted to do. I'd love to place some of those search folders near the folders related to whatever the alert was... and given that I repeatedly saw other people having the same wishes when I looked online for solutions, I think it's a pretty common desire... but I'm going to guess there'd be some complicated coding involved in doing so. Anyways... I put the search folders I know how to address up in my favorites, so I can easily see when something new comes up.)

There's a lot more, of course. I'm still learning what various alerts and messages mean, and I'm sure I'll eventually be updating and/or writing my own scripts. I spend most of my time on the command line (or, well, with Linux it's the terminal for a Bash or Korn shell) and I'm getting pretty good at running commands like 'ps -ef | grep <xxxx>' to find whatever current processes have <xxxx> going on.
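Since I mentioned eventually writing my own scripts, here's roughly how that same 'list the processes, then filter' idea could look in Python. This is just a sketch: the function names are mine, and the sample lines are invented to resemble `ps -ef` output so the filtering can be shown without depending on whatever happens to be running.

```python
import subprocess


def list_processes():
    """Return the lines printed by `ps -ef` (needs a Unix-like system)."""
    result = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()


def matching(lines, needle):
    """Keep only the lines that contain `needle`, like piping into grep."""
    return [line for line in lines if needle in line]


# Invented sample lines in the general shape of `ps -ef` output.
SAMPLE = [
    "UID        PID  PPID  C STIME TTY          TIME CMD",
    "appuser   4021     1  0 09:14 ?        00:00:12 java -jar billing.jar",
    "appuser   4377     1  0 09:15 ?        00:00:03 python3 cleanup.py",
]

hits = matching(SAMPLE, "java")
for line in hits:
    print(line)
```

On a real machine you'd call `matching(list_processes(), "java")` instead of using the sample. One small difference from the shell version: `ps -ef | grep java` often lists the grep command itself as a match, which this approach avoids.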

I think I can safely say that applications are a lot more than just the .exe file. They include the machines, third party software for synchronization or whatever, database queries, scripts, logs, and various methods for monitoring and alerting when issues develop... all of that, for each of the many, many environments...

And the back end is a complicated, complicated place.