Posts

Showing posts from December, 2014

Setting up Graphios when using OMD/Check_MK

The Graphios instructions provided by the author are accurate, but for those who need some hand-holding (like me), here's the actual process (on CentOS 6.5, OMD 1.20). If you need help, comment and I'll see what I can do. I'm new to this, too.

Install pre-reqs
Tried using pip, but still had issues with post-setup, so I just cloned from github

Install Graphios
cd into your cloned repo
python setup.py install

Edit the config file
edit /etc/graphios/graphios.cfg
See below because blogger just isn't great for code blocks...

Configure service
I didn't have much luck with the post-setup, so I did this:
edit /etc/init.d/graphios
Paste in the contents of: https://github.com/shawn-sterling/graphios/blob/master/init/rhel/graphios
Change GRAPHIOS_USER to match your OMD sitename user
I changed the log file to: /var/log/graphios.log
Then changed permissions/owner on that file to OMD sitename user
Probably will need to set up log rotation?...

Creat

Monitoring update - it's working for us!

I had a post about monitoring a while back; here's a progress update...

Some ideas behind what we're trying to achieve:
Gain insight into how our systems are functioning (primarily from an application point of view, secondarily from a server point of view)
Know that something is broken before it breaks (or, at minimum, immediately)
Show people how our systems are functioning
Use data (metrics/logs/etc) to aid in trending, RCAs, etc

Again, our big catch is that we run 95% Windows in our 'production' environments. An MS shop, if you will. Another big catch is a very small budget for this sort of thing. Another big catch is very little support from developers (who are busy working on revenue-generating projects). Another big catch is that the existing monitoring infrastructure consisted of WhatsUp Gold and hard-coded email alerts inside the system (i.e. we have to dump it).

So...the internet provided ideas and tools; we just put them together.

Data sources: Windows event log &

VMware SnS and Patches/Updates

According to VMware's EULA, you are only eligible for patches if you have a valid SnS contract.

So when you purchase your ESXi licensing and only opt for a 1-year SnS because who needs support (when you can pay for single support incidents) and you get major version upgrades when you do hardware refreshes anyways...and your Dell rep just smiles and nods... When you do that, you then get to go back to your manager and tell him, oh yeah, nobody picked up on this: we now owe VMware a minimum of $many-k that must be paid in the next 30 days or our contract lapses and we then get to pay penalties on top of that money.

Now, while I suspect that patch functionality won't stop working due to SnS status, the legal right to apply those patches certainly does.

Lessons:
The vendors will not hold themselves accountable when something like this comes up. It is always your fault, so view all potential expenses in that light.
VMware SnS should be considered a mandatory thing, even though 'updates/patch

Intel's SSD migration tool & TrueCrypt - big gotcha

TL;DR: If you must help someone migrate from HDD to SSD, uninstall/disable any drive encryption before even starting this process.

If you are going to use Intel's SSD migration software (Acronis-based) to 'help someone out', be VERY VERY VERY VERY certain that the drive is not encrypted before starting. In retrospect, a pretty noob mistake, but it does highlight something that I haven't had much exposure to (disk encryption).

What happens is that the clone software modifies the MBR or something like that so you boot into the clone software rather than the OS. TrueCrypt goes bananas and the Intel/Acronis software throws "MBR ERROR #3" and/or #2/#1.

No prob, right? Just run bootrec /fixmbr and bootrec /fixboot off a Windows recovery disk and you're back to normal! Nuh uh. Drive is nuked. I would note that at this point NO CLONING HAS BEEN DONE. Simply installing the software, starting the clone wizard, and rebooting has broken the drive.

The only way to rec

TFS & GO & Chef, oh my: Part 12 - Conclusion, and lessons learned

Well, the presentation went over really well, so this project is now a real thing. I'll probably post a follow-up once the pilot is in place that will highlight the questions that needed answering before it could go ahead.

If you're looking at having to go this route, some questions to consider:
Why are YOU looking at solving this problem? If it's just you, either you or everyone is doing it wrong.
Is there a cross-dept team linking the project together? Why not?
What is 'DevOps' to you? It should mean 'making things flow better'.
(realizing the irony of this) Have you already picked out tools? If so, you're doing it wrong.
Do you understand exactly what you're trying to accomplish? I mean, really? You sure?

Feel free to reach out if you have any questions. I know these posts have been incoherent at times.

Lessons learned
Technical: Using Go & Go agents to remote around means you're always kinda wondering what us