Posts

Showing posts from November, 2011

Openfiler project

As part of the lab rebuild, I am setting up an Openfiler box. Some features I am using:

- iSCSI block-level storage with path redundancy
- FC block-level storage
- Block-level replication (potentially have a 2nd host)

As I've been setting it up, I'm realizing that it uses LVM for storage, which is kinda nice as it dovetails into what I've learned from the clusters at work.  Also realized that Fibre Channel has way more curb appeal than iSCSI.  I've not had a chance to integrate the FC switch yet, but that's on the list.  What I have done is document (thanks to the internet and some testing) the steps involved in setting up/giving new access to the target and new servers.

For the price, I'd say you can't go wrong with this setup for a home lab.

- Openfiler has a 4-port FC card ($80 per card)
- Each ESXi server has a 2-port FC card ($40 per card)
- FC cables from monoprice ($10 each)

Now, the motherboards I am using in the ESXi servers do not have PCI-X s

Openfiler - Errno 104 (conary updateall)

If anyone is getting this error:

    /usr/lib64/python2.6/socket.py:381 error: [Errno 104] Connection reset by peer

It is probably caused by a proxy or transparent proxy blocking access.  In my case it was my Astaro firewall - I had to add a 'Web Filtering/Exceptions' rule for the Openfiler host.  I set it to allow everything, which is probably safe - no browsing will be done, just the conary updates.

I also found this bug on the rPath site (even signed up to report!) from 2009, and the dev pointed to the proxy as a point of interest.  (  https://issues.rpath.com/browse/CNY-1958  )

Once the fix was in place, I decided to run the Update utility from the web GUI (which had also failed before), and it gave me errors for about 5% of the packages to update, although the rest installed correctly.
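Errno 104 is ECONNRESET on Linux: the remote end, or something sitting in between such as a filtering proxy, dropped the connection. Here is a minimal Python sketch of how a script could tell that case apart from other socket failures. The helper name `describe_socket_error` is made up for illustration and is not part of conary or Openfiler.

```python
import errno


def describe_socket_error(exc):
    """Return a short hint for an OSError raised by a socket call.

    Hypothetical helper for illustration; not part of conary or Openfiler.
    """
    if exc.errno == errno.ECONNRESET:
        # On Linux this is errno 104, "Connection reset by peer".
        return "connection reset: check for a proxy or firewall in the path"
    if exc.errno == errno.ECONNREFUSED:
        return "connection refused: nothing is listening on that port"
    return "other socket error: %s" % exc.strerror


if __name__ == "__main__":
    err = OSError(errno.ECONNRESET, "Connection reset by peer")
    print(describe_socket_error(err))
```

A reset partway through a transfer tends to point at a middlebox rather than at the update server itself, which matches what the firewall exception fixed here.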

Service accounts & domain admin privileges

Over the last few weeks I've had a good couple of lessons around service accounts and domain admin privileges (and who should have them).  What came to mind was a kind of cascading failure caused by not following best practices.

1.  Management team finally authorized the changes the Windows admin had asked for - the removal of all 'regular' users from the Domain Admins group along with the creation of 'admin' accounts for people that required them.
2.  Users were removed from Domain Admins group.  Windows admin did not communicate this to anyone.  Management did not communicate to users that this was going to happen.
3.  Random things began to break.  Small in-house-programmed websites stopped working, workflows were disrupted, ticket queue built up, etc.
4.  After spending a lot of hours trying to figure out why these things were breaking, someone happened to mention that 'oh, admin removed domain admin privs for everyone'.
5.  Light bulb.
6.  Confirmed that each and every

Apache clerestory

As I get going with all this, it's becoming clear I should have a dedicated Apache box.  The wiki box is the obvious example, but might as well start fresh so there's no wiki nonsense buried in there.  Will also give me a chance to document the migration of sites from one server to another.

Further, because we're hardcore here, we'll be doing an Apache CLOISTER.  I mean clerestory.  Cluster.

From my day job I'm reasonably familiar with clusters, and I'll transfer my wiki info from there to the PTC wiki (yes this is okayed by them).  Obviously info will be sanitized, and frankly a lot will change since I'll be running through the entire process and making corrections/addendums, so there.

Clusters.  Uptime.  Fo sho. I am not making this up. The more you know!

Reasons why the wiki is down:

1.  Software issue (service crashed).
2.  Hardware issue (like CPU broken).
3.  Internet is out.
4.  Power is out.
5.  Meteor.

Update:  In this case, option 3.  Out for an hour.  I should really call Bell and see what the heck.

Core i7 failure

Well, there's a first time for everything.  Actually had a CPU go bad on me.  Thought it was the mobo for quite some time, didn't even consider CPU failure as an option.  Thank the Lord I had the spare lab box with another Core i7.  Little upgrade I guess...i7-920 to -950.  However, the CPU has a code that indicates 2008 manufacture, so it might be out of warranty.  I'll be calling Intel tomorrow, so praying it'll be covered.  If not...not cheap to replace!

For anyone interested, the symptoms of the failed CPU are just this: System will no longer POST, or show any video.  For all appearances it looks like a dead motherboard, or perhaps dead PSU.

update: Ah, was purchased in July 2009, so should be okay.

Constructive update: Decided to post my troubleshooting process just in case someone isn't sure: System BSOD'd. I noted the BSOD error and codes via photograph, just in case they would help.  No drivers were mentioned in the BSOD, so chances are it's har

Synology NFS 'access denied' and resolution

Fun time figuring this out. I was being bad and not documenting, but here's what I recall:

- Kept getting 'access denied by server while mounting' errors when using this command:

      mount -t nfs 10.0.0.14:/volume1/mysql_backup /srv/backup

- Checked and re-checked the Synology settings to no avail.  Thought it was something to do with root squash - was not.
- Correct settings should be: correct IP address, RW, No Mapping, Enable Async
- SSH'd in to the Synology and, after some messing about with /etc/exports, I set up

      tail -f /var/log/messages

- Took me a while to notice it, but the IP it was registering was the Astaro gateway IP - the Synology and my PC are on different subnets!
- Set the NFS rule to '*' and it started working immediately.

Firewalls make it easy to overlook simple things.  I imagine there is some sort of fancy NAT rule for the NFS traffic that would allow specific IPs, but seeing as how I'm technically behind two firewalls and this is a lab, the allow
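The gateway-IP surprise above can be shown with a small Python sketch. It is a simplified illustration of matching a source address against an export rule, not how the Synology or /etc/exports actually parses rules. The PC and gateway addresses are assumptions; the post only gives the Synology's address (10.0.0.14).

```python
import ipaddress


def client_allowed(source_ip, allowed_rule):
    """Return True if source_ip matches an export rule.

    allowed_rule is '*' (anyone), a single IP, or a CIDR network.
    Simplified illustration only; real NFS servers support more rule forms.
    """
    if allowed_rule == "*":
        return True
    network = ipaddress.ip_network(allowed_rule, strict=False)
    return ipaddress.ip_address(source_ip) in network


# Hypothetical addresses, chosen only for the example.
PC_IP = "192.168.1.50"    # assumed address of the client PC
GATEWAY_IP = "10.0.0.1"   # assumed address the NAS sees after the gateway

if __name__ == "__main__":
    # The rule was written for the PC, but the NAS sees the gateway's address.
    print(client_allowed(GATEWAY_IP, PC_IP))       # False
    print(client_allowed(GATEWAY_IP, "*"))         # True
    print(client_allowed(GATEWAY_IP, GATEWAY_IP))  # True
```

If '*' feels too loose, a rule for the address the NAS actually logs (here the gateway) would probably be the narrower option, though that still admits anything behind the same gateway.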

Moving mysql databases into a central mysql server

Today I'm working on moving the wiki database over to the central mysql server.  I still have to work through implementing best practices and whatnot, but everything is functional to this point:

- Wiki database is being served from the central mysql server.
- Central mysql server has the mysql database directory housed on a secondary disk.
- Documentation is available that should enable moving other applications over to the central mysql server.

Yup, making progress.   http://wiki.practicaltech.ca/index.php/Mysql

Next steps:

- mysql backups are a necessity now, so have to configure that - could involve some scope creep as I have thought about setting up a backupPC server
- Syslog server, also attached to central mysql server
- NagiosXI server, again on the central mysql server, and will also require postgresql

vCenter server is on my priority list as well, but we have to wait on sorting out the hardware issues, and document the iSCSI Openfiler setup.  I've finally moved t

Quick update - mysql & SNMP

Last night I also got a mysql VM and syslog VM up.  After the wiki issue, I figured now was a good time to get the database off the wiki server and onto a dedicated mysql server.  That server will then back up onto the NAS - something that will help me sleep better at night.  The dedicated mysql box is also pretty key for a number of other Linux applications.

I am not certain yet if I should be running postgresql and mysql on the same VM (even though there will be separate VMDKs for DBs, logs, temp, etc (wrong terminology??)).  At any rate, once that's ready and documented, I can really get cracking.  Can't wait to get the Nagios box online as well.

Speaking of which, I've learned a lot about SNMP lately.  The process from trap to Nagios alert is quite involved:

1.  Trap to Nagios host
2.  snmptrapd receives trap
3.  snmptrapd formats trap as per snmptt
4.  snmptt logs the trap and looks up what happens
5.  snmptt runs the EXEC line which submits the result to Nagios

I'd have t
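The last step of the trap-to-alert chain, submitting the result to Nagios, usually comes down to writing a passive check result into the Nagios external command file. The post doesn't show the actual EXEC line, so what follows is only a sketch of that final write in Python. The host name, service name, and function names are made up, and the command file path varies by install.

```python
import time


def format_passive_result(host, service, return_code, output, timestamp=None):
    """Build one Nagios external-command line for a passive service check.

    return_code follows plugin conventions:
    0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)


def submit_passive_result(line, command_file):
    """Append the line to the Nagios command file (path varies by install)."""
    with open(command_file, "a") as handle:
        handle.write(line)


if __name__ == "__main__":
    # "switch01" and "TRAP" are made-up host and service names.
    print(format_passive_result("switch01", "TRAP", 2, "linkDown received",
                                timestamp=1320000000), end="")
```

The host and service names in the line have to match a service defined in Nagios with passive checks enabled, otherwise the result is dropped.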

Wiki down and up

It was down for two reasons:

1.  My PC is a new Win7 image and I'd forgotten to strip Windows Update of any reboot privileges.  (it rebooted, halting the VMs)
2.  When I powered the VMs back up, SELinux turned itself back on, blocking access to the mediawiki files.

Lesson:  Either fix SELinux or turn it off.  Since this is internal, and I have other stuff to do, we'll turn it off.  Hosts directly facing the outside should have SELinux operational and properly configured.

Now, to finish planning all of this out.

I decided to move the Exchange 2010 box as it's 'sort-of' live (accepting junk mail at the moment), and it is the last running VM on the old 'big' ESX host.  Once it's moved (250+ minutes to go - thank you ESX file-transfer speed restrictions) I can install Openfiler or whatever I decide on going with and get that tuned up.  Only downside to that is I'm still using the one SAS bay in the desktop, but that's not critical, I suppose
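On SELinux "turning itself back on": one likely explanation, which the post doesn't confirm, is that it had only been switched off for the running session (setenforce 0), and that doesn't survive a reboot. The persistent setting lives in /etc/selinux/config. A minimal Python sketch that reads the configured mode from text in that file's format, using a made-up function name and sample content:

```python
def configured_selinux_mode(config_text):
    """Return the SELINUX= value from /etc/selinux/config-style text, or None."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip() == "SELINUX":
            return value.strip().lower()
    return None


# Sample content for illustration; on a real host, read /etc/selinux/config.
SAMPLE = """\
# This file controls the state of SELinux on the system.
SELINUX=enforcing
SELINUXTYPE=targeted
"""

if __name__ == "__main__":
    print(configured_selinux_mode(SAMPLE))  # enforcing
```

If this reports enforcing on a box where SELinux was thought to be off, the next reboot will bring it back.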