HostMonster Raw Access Logs

Anyone on here have their hosting through HostMonster? I’m pulling my hair out trying to get their raw access logs to work… I can download them and extract them without problems but when I try to read them, I see nothing but gibberish in EditPlus, Notepad, or whatever else I use. It’s not meaningful hex or some form of web encoding; it’s just gibberish.

So I decided to call HostMonster and finally got a hold of someone who later in the support discussion asked whether I have a Windows or a Mac machine. I told him I’m running a Windows 7 machine and after being placed AGAIN on hold, he later came back to tell me that they were unable to replicate the issue and that everything is extracting and looking okay on their end. Without saying it, he basically implied that I was SOL and never said anything further. Gee, okay.

So obviously at this point, I’m pretty frustrated. I then asked him if he tried this on one of his Windows 7 machines that I assumed they had… Guess what? They don’t have a single Windows machine. Not one! I guess I’ve been under a rock because where I come from, every IT company should have at least 1 Windows machine despite whatever stigma Windows users have within the Linux crowd. That’s like IT 101 is it not? Needless to say, I’m ready to leave HostMonster. What a joke.

In the meantime, I’d love some possible suggestions to try in opening these files. The guy from HostMonster said he was using Ubuntu on his side… So I loaded up my Ubuntu VM, attempted to open the files in both GUI and console–still no dice–and this time, when trying to use the archive manager to extract the *.GZ file, the archive manager complained about the encoding type! So it’s like I’m damned if I do and I’m damned if I don’t. I’m beginning to wonder if some hacker has pulled a slick one over on their IT staff or something…

Does anyone have any suggestions here? I tried to even use “gzip -d archive.gz” to extract the file onto the server and then view this file FROM the server… STILL NO DICE! It really makes me wonder if they’re seeing what they think they’re seeing from HostMonster when they extract and view the file… Long story short, I’ve lost my trust in that company and I’m extremely frustrated at this point.

Any help / insight / whatever is appreciated. I’d also love suggestions for another host.

A default Ubuntu might not have the decompressor for Archive Manager to read GZ; I remember having to add a load of opts to get it happy with most common file formats.

However, it sounds odd that you can’t decompress it on the server. If you have shell access, you should normally be able to see the raw logs in /var/log/httpd (or /var/log/apache).

Compression of log files is common; they take up so much disk space that if you don’t compress them, you’ll burn through your quota. Can you unzip one and give a little snippet? Might be able to see something obvious like a format for compression / encoding?
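One way to spot "something obvious like a format for compression" without pasting binary into a forum post is to look at the first few bytes of the extracted file. The sketch below is only an illustration: the function names are made up for this example, and the table covers just a handful of well-known signatures (gzip, bzip2, xz, zip).

```python
# Leading "magic" bytes for a few common compression formats.
MAGIC_NUMBERS = {
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"\xfd7zXZ\x00": "xz",
    b"PK\x03\x04": "zip",
}


def sniff_format(data: bytes) -> str:
    """Guess a format from the leading bytes of a file's contents."""
    for magic, name in MAGIC_NUMBERS.items():
        if data.startswith(magic):
            return name
    return "unknown (possibly plain text)"


def sniff_file(path: str) -> str:
    """Read only the first few bytes of a file and guess its format."""
    with open(path, "rb") as handle:
        return sniff_format(handle.read(8))
```

Calling `sniff_file()` on the file that came out of the archive would report "gzip" if the supposedly extracted log is itself still a gzip stream, which would explain gibberish in a text editor.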

W_22,

Well, you’ve stumped me, too! Does your download file have a file extension? IMHO, it should be in .txt format (probably a .log file extension) but, if it’s compressed to facilitate downloading, try using 7Zip as it can handle the various Linux compressions and make the data available for you to read (I wouldn’t bother with NotePad, though - I used EditPad Lite but your EditPlus should do).

Regards,

DK

Tim & David.

I just logged into the server using PuTTY and judging from what I can see in the /var/log/httpd location, it’s filled with a bunch of odd stuff that just doesn’t look like what log files normally appear to be. For example, there’s 2 directories and a bunch of extension-less files. The two directories are “archive” and “fcgidsock”. The files are “cgisock.<number>”, error_log.<number> (which would normally be assumed to be the raw logs, but they’re not: they appear to be actual error logs–completely different from something like raw access logs–they’re also EMPTY anyway), and a bunch of other meaningless files that don’t appear to be log files at all–more like Linux app files that pertain to other server processes.

I tried to post what I’m seeing in these things but it wouldn’t work because when I copied the obscure code from the file I just extracted and tried to paste it inside the SitePoint WYSIWYG editor in this response, it only pasted a few characters (despite trying to use the CODE button). So I decided to attach an image of what I’m seeing instead–see below. (I hope it helps diagnose this–if you can’t view it or if it doesn’t display okay, let me know and I’ll figure something else out). I’ve also been trying to view this in ANSI encoding–which has always worked before and is my EditPlus default–but I also tried opening it in other encoding types to no avail, too.

The latest file I just extracted from the *.gz download I just downloaded was named as follows: accesslog_<website name>.com_10_13_2012.

Because of how it’s named, Windows seems to default it to a Windows Shell Extension, but I wouldn’t think that should matter if you open it in a text editor… But in any event, I decided to give it first a .txt extension and then a .log extension–both to no avail: still the same gibberish in the editor…

Since I don’t know how to get to the raw logs through shell in order to extract one, I just downloaded a .gz log to my Windows machine from CPANEL and then re-uploaded the .gz file to the server so that I could then use the server’s shell to run something like “gzip -d <filename>” to extract it on the server. When I do this, it’s the same dice: same gibberish inside both nano and vi. I’m not sure what else to do in order to get to the raw files otherwise…

I really think that either there’s something broken on their end (HostMonster’s end, I mean) or else someone is lying to me from HostMonster in order to avoid doing something they don’t want to do… I’ve read things on the internet before about how unprofessional they can be. I’m not sure if any of it’s true but obviously it has me not trusting them and thinking of the worst possible scenario. I’m a very paranoid person by default, so this has me tripping. Ha.

I’m just not sure what to make out of any of this but I’m definitely going to be moving away from them as soon as I decide on a different host (which looks like it will be either HostGator, 1&1, BlueHost, or DreamHost–which seem to be the consistent preferences across all the “Top Hosts” pages I’ve scoured through Google so far).

Any input is appreciated and thanks in advance.

I have cPanel with my host and I assume that the raw logs will be similar. Here’s what I do:

  1. Download raw log which comes in a compressed gz format to computer using web browser (running Windows Vista and Firefox).
  2. Extract the log from the gz archive using WinRAR (great little program).
  3. Change the extracted file extension from .com to .txt (.com extension has special meaning in Windows).
  4. Open text log file using Notepad, Wordpad or Notepad++.

I just tried it. Works fine.

You know what they say about assumption?

You tried my suggestion and it didn’t work for you?

Yes. The problem isn’t about extracting the file itself but instead about reading what’s in the file–which is gibberish (see image above). That’s why I said that I also tried to do it all from the server to avoid the variable of something being wrong with how I’ve used a locally-installed program or whatever. Still no dice. Every time, the log file contains gibberish… By the way, WinRAR is what I always use, too. :)

I went through this same issue awhile back (about 3-5 months ago–I think back in July). I sent e-mail after e-mail to HostMonster techies asking them to help me with this. After going through what appeared to be about 2 Tier-1 techs, I finally wound up speaking to someone who seemed to know a thing or two about their servers. After speaking with him, even he couldn’t understand what was going on and eventually I gave up after they extracted one of the logs themselves. Unless I’m mistaken, I think they used something other than gzip or whatever to extract the file. Anyway, after he extracted a file, he placed it in my home directory on the server so that I could download it and read it. The odd thing is that as soon as I hung up the phone with him that very last time, I thought to myself, “…well, let’s log in just one more time and attempt to download / extract a log file just for *****-and-giggles.”

(It was driving me batty not being able to do it myself.)

So I did, and as if someone from HostMonster suddenly noticed a misconfiguration, everything downloaded, extracted, and READ perfectly fine. I was dumbfounded and completely breathless. It’s as if after going around the bend with them on this that they suddenly saw something on their end and flipped a switch or something. Strangest experience I have ever had with them… The thing that made me angry is that they never owned up to it because I sent them another e-mail explaining that it suddenly worked and they never responded (or else, I might’ve mentioned it in one of their customer service surveys that they have). Anyway, that’s why I’m paranoid in this situation…

But yeah, I did what you advised and it’s the same crap. Just a bunch of jumbled garbage in the file when I open it in ANYTHING (i.e. - NotePad, EditPlus, Nano, and vi–these last two being on their machine).

W_22,

Two things concern me:

  1. That garbage appears to be some sort of compressed data as it would appear in a text editor (which can print the odd characters). I still believe that you need to apply a decompression tool (7-Zip can decompress several different formats so you can attempt whichever ones you want until you discover the one used). However, that leads to the question as to whether you’re still experiencing the problem or it’s finally been resolved (going forward, anyway)?

  2. Your host is horrible and I would have relocated long ago. While I extol the virtues of WebHostingBuzz, I would advise STRONGLY that you NOT use HostGator or, especially, 1&1 as they are both routinely panned in this board. I have posted in other threads my “process” for selecting a new host but, if you can’t find it, I can repost for you here, too (or merely PM you the .txt file).

There are too many hosts out there to put up with the sort of nonsense you’ve received from HostMonster. Before you leap to another one, be sure you’re not jumping from the frying pan into the fire (with the GoDaddy/1&1 class host) as you’re sure to regret it (and have to move again ASAP). Just be wary of those websites which profit from (false) recommendations.

Regards,

DK

I agree, David, that the garbage looks like compressed data being opened in an editor, but even after trying to extract it with 7zip, it still displays that junk. I’m totally at a loss here.

I’d love to know what your process is, so feel free to post a link to it. I’ll probably need it in looking for a new host soon.

W_22,

Per your request:

[indent]I offer my standard advice:

  1. Establish your requirements, i.e., Linux, Apache 2.4+, PHP 5.2+, MySQL 5+, the preferred control panel (e.g., cPanel) and storage and bandwidth requirements. Remember to allocate for log files, databases, e-mail (attachments) and growth.

  2. If you’re looking for a VPS or dedicated server, remember to ask what the host’s managed services provide. Remember, a non-managed host must be monitored by you 24/7/365!

  3. Know what control panels you are willing to use, i.e., WHM/cPanel. cPanel is the standard bearer for Linux systems and Plesk for Windows systems.

  4. Know how much CPU time/RAM you need. If you need a lot of processing power (like Zoomla and other CMS’s), this will be a major factor. These, however, are usually specified only for VPS/dedicated accounts and automatically throttled for shared/reseller accounts.

  5. Know your target audience’s location and try to host as close to it as possible (the Internet is fast but some latency could hurt, so the closer your server is to your target audience the better).

  6. SEARCH (using the above parameters), recording each feasible host as well as how well it satisfies your requirements and budget. Spreadsheets are good for this as you can assign weighting to the different requirements and how well they were met to generate numerical scores.

  7. Create a shortlist based on the database you’ve created in step 6 then SEARCH for comments about the host (avoiding obvious shills and websites which advertise for that host).

  8. (from EastCoast) “Eliminate anonymous companies - if a hosting company doesn’t have a full office address and company registration details visible on their site, it’s often down to the amateur status of the operator, which is unlikely to be consistent with longevity and reliability.”

  9. (from EastCoast) “Eliminate new companies - hosting has a very high fail rate because of the low barriers to entry. If a company makes it through its first 5 years then it’s likely it’s jumped a few hurdles and knows what it’s doing sufficiently to have made a viable business. Not all new companies are cowboys, but the percentage is high enough that it’s not worth the risk of being the one to find out the hard way, when there are plenty of other options.”

  10. Eliminate companies which do not tell you exactly what you’re getting for your money, i.e., the Control Panel, the storage, the bandwidth (traffic), the versions of the main daemons (Apache, PHP and MySQL), the SSL and dedicated IP charges, etc. That’s where knowing your requirements comes in strongly!

  11. The last step (other than selection) is to contact each shortlisted host with a question (I’ve used .htaccess and mod_rewrite availability, which services are managed by the host, the availability of IP addresses - you will require one for each SSL you use - or ask to test proprietary control panels - they may make life too difficult for you) and record the response time and your level of satisfaction with the response.

  12. Finally, you’ll have enough information to make an intelligent selection: “Just Do It!”

Been there, done that (all too frequently in the past).[/indent]
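The spreadsheet scoring mentioned in the SEARCH step of the list above comes down to a weighted average. Here is a rough sketch of that arithmetic; the requirement names, weights, ratings, and host labels are all invented purely for illustration and are not recommendations or real measurements.

```python
def weighted_score(weights: dict, ratings: dict) -> float:
    """Weighted average of ratings, using the same keys as weights."""
    total_weight = sum(weights.values())
    return sum(weights[key] * ratings[key] for key in weights) / total_weight


# Hypothetical importance of each requirement (higher = matters more).
weights = {"price": 2, "support": 5, "uptime": 3}

# Hypothetical 1-5 ratings for two made-up hosts.
hosts = {
    "Host A": {"price": 4, "support": 2, "uptime": 3},
    "Host B": {"price": 3, "support": 5, "uptime": 4},
}

# Rank the hosts from best to worst score.
ranked = sorted(hosts, key=lambda name: weighted_score(weights, hosts[name]), reverse=True)
for name in ranked:
    print(name, round(weighted_score(weights, hosts[name]), 2))
```

With these invented numbers the cheaper host ranks lower because support carries the most weight; changing the weights changes the ranking, which is the point of writing your requirements down first.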

As I said earlier, though, make sure that your recommendations come from reliable sources. I’ve just gone through an installation of maldet on my dedi and WebHostingBuzz’s support team was at its usual excellent, fast level of service … on a special request! Of course, I have my “warm fuzzy” level of confidence that I have no malware on the server as well as reinforcement of my extremely high level of confidence in the WHB support team.

Regards,

DK

Thanks for the input, David. That’s some good advice… Especially the part about response time. I never thought about it in that light.

For whatever it’s worth, I found out that the raw access log archives I kept downloading were actually double-archived! lol.

So in other words, after I downloaded an archive from their CPANEL, I had to first extract what I had just downloaded and THEN extract THAT TOO (after renaming it to <filename>.gz).
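For anyone hitting the same thing, the "extract, rename, extract again" routine can be done in one go by decompressing for as long as the result still starts with the gzip signature bytes (0x1f 0x8b). This is a minimal sketch using Python's standard gzip module; the function name and the `max_layers` safety cap are just choices made for this example.

```python
import gzip
from typing import Tuple

GZIP_MAGIC = b"\x1f\x8b"


def gunzip_all_layers(data: bytes, max_layers: int = 5) -> Tuple[bytes, int]:
    """Decompress repeatedly while the data still looks like gzip.

    Returns the final bytes and the number of gzip layers removed.
    """
    layers = 0
    while data.startswith(GZIP_MAGIC) and layers < max_layers:
        data = gzip.decompress(data)
        layers += 1
    return data, layers
```

Reading the downloaded archive in binary mode and passing its bytes to `gunzip_all_layers()` should return the readable log text plus a layer count of 2 for a double-archived file. If a file merely happens to start with those two bytes without being gzip, `gzip.decompress` raises an error rather than returning garbage.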

Totally boggles my mind… Sent HostMonster an FYI about it.

W_22,

Wow! Thanks for that info! Very strange but a good data point to keep in mind.

Regards,

DK

I’ve seen that before… on certain older control panels - the server would gzip as part of its log rotate, and then the control panel would gzip it again.