Louis: Hello and welcome to another episode of the SitePoint Podcast. We’re back this week with a news and commentary show. With me are two of the regular panelists, Kevin Dees and Patrick O’Keefe. Hi, guys.
Kevin: Howdy, howdy.
Patrick: We’re back.
Louis: Yeah. It hasn’t been, you know, terribly long since you were here on the last show. Steven unfortunately is away this week. He’s sick. So we’ll hope he feels a little bit better and he can get back with us next time.
Kevin: Get well.
Louis: So let’s dive straight into this week’s news. Patrick. You had a couple of stories there.
Patrick: Yeah, I do have a couple of stories. So FireHost is a cloud hosting provider. They’re renowned for their secure cloud hosting, they really focus on security, and they have published the results of a statistical analysis of 15 million cyber attacks that were blocked by their servers in the US and Europe during the third quarter of 2012.
They categorized the attacks into four categories: cross-site scripting, cross-site request forgery, directory traversal, and SQL injection. Of those four categories, cross-site scripting attacks were far and away the leader and, most importantly, the top two attack types grew by an estimated 69%.
Cross-site scripting represented 35%, cross-site request forgery 29%,
directory traversal 24%, and SQL injection just 12%. If you don’t know what those are, you’re like me, a real layman: the XSS attacks, the cross-site scripting attacks, involve a web application gathering malicious data from a user via a trusted site, often coming in the form of a hyperlink containing malicious content.
Then the CSRF, or cross-site request forgery, attacks exploit the trust a user has for a particular site. Those two attacks are far and away the most prevalent. Likewise, the United States was the most prevalent origin of the attacks: 74%, or 11 million, came from the US. There was a shift, though, in second place, which in this quarter was Europe, or I shouldn’t say country, but Europe, at 17% of all malicious attacks, whereas Southern Asia, previously the second-place leader, was at 6%.
This is the part where I just kick it over to you guys to talk about the importance of sanitizing and whatever words you use.
Louis: Yeah. Obviously, this is interesting in a few ways. I think the fact that SQL injection represents such a small percentage of the overall attacks is perhaps indicative of developer education, in that there are probably very few websites out there that are actually vulnerable to SQL injection, because one of the first things you’ll learn if you’re learning any kind of back-end programming is to sanitize database input.
You know, I think it shows that perhaps it’s not really as serious an attack vector anymore, because so many people are able to defend against it pretty easily, and almost every database connection library or object-relational mapper you’ll use will automatically handle escaping SQL as it goes into the database.
So it’s pretty unlikely in a lot of cases that you’d wind up with SQL injection vulnerabilities. Although obviously, I don’t know, they said they blocked 11 million, and 12% of those were SQL injections, so it’s definitely not gone away.
Patrick: Yeah, the actual total was 15 million. And just to add on to your point, is it sort of, I don’t know if you want to call it a consolidation, but it seems like, for all the different types of content people share, more and more sites are on these prevalent applications, like WordPress for blogging and CMS use, and the other options that are available.
Whereas in the olden days there were more custom solutions, programmed just for that website. Now you have these tools that people are really focused on, and as such, the people behind those tools and the communities behind them are really focused on these areas and they’re really thoroughly vetted. As more and more people opt for those common solutions, SQL injection as a problem goes down. Is that fair?
Louis: I guess you might be able to make that argument. Although, the issue then is that for example, for something like WordPress which has a very open plugin architecture…
It’s extremely easy for some random plugin to have an SQL injection vulnerability, because all the plugin has to do is do something careless and hit the database directly, or take user input from a form and concatenate that directly into a database query, and boom, you have a vulnerability.
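The concatenation mistake Louis describes can be sketched in a few lines of Python. The table, the values, and the injected string here are all hypothetical, with the standard library’s sqlite3 standing in for whatever database library a plugin might use:

```python
import sqlite3

# Hypothetical demo of the vulnerability described above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "nobody' OR '1'='1"  # attacker-controlled form value

# Vulnerable: input concatenated straight into the query. The stray
# quote turns the WHERE clause into a tautology, matching every row.
vulnerable_rows = conn.execute(
    "SELECT * FROM users WHERE name = '" + user_input + "'"
).fetchall()

# Safe: a parameterized query treats the input as a single literal value.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(vulnerable_rows), len(safe_rows))  # 1 0
```

The parameterized version is exactly what the ORMs and database libraries mentioned later do for you automatically.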
While that is probably true of people using a vanilla WordPress install, or any of these other systems, you know, Magento, Free Commerce, or Drupal for bigger CMS-type sites, it’s not true as soon as you start adding in plugins written by developers with maybe less discipline, less oversight, where there are fewer people working on the project and it’s not as tested as the overall framework is.
Obviously, it’s something that, if you have an application, you really need to be paying attention to. And if you’re using any kind of plugin in WordPress, don’t just, you know, go out there and grab the first thing that comes along.
If you are familiar with a bit of how these attacks work, and you should be as a developer, pop open the code files of the plugin and just have a quick look. In the case of SQL injection, it’s really obvious whether database input is being sanitized.
In the case of XSS as well, it’s just a case of looking at whether any kind of user input is sanitized, either on the way in or on the way out, for script tags for example. There is some credence to the idea that the prevalence of some larger systems might have had an impact on decreasing some of the more common attacks; it’s maybe just not as true as you’d first think, because of all these plugins.
Kevin: Yeah, I would have to agree with that. I mean, it all comes down to the developer that’s building something, right? Whether it’s you or somebody else. Like with WordPress, for example, with a vanilla install, sure, you’re fairly secure. But as soon as you start using another developer’s code, like a plugin, you won’t have the assurance that you need unless you know that developer or you’ve seen their code. It’s kind of a trust trade-off.
Patrick: And just to show where these types of attacks have grown, numbers-wise, Q2 versus Q3: directory traversal was number one for Q2 with 43%; it’s now 24%. Cross-site scripting went from 27% to 35%. SQL injection fell from 21% to 12%. Then the big gainer is really cross-site request forgery, which went from 9% to 29%. Another anomaly in the stats is that the other three all originate for the most part from North America, but for some reason CSRF is coming from Europe. I don’t know what the Europeans are doing with cross-site request forgery, but obviously something is going on there.
Louis: Right, yeah. I mean, CSRF is interesting because it’s a little bit trickier and it requires that you really have some way of validating requests. Ruby on Rails, for example, has request forgery protection built in, but that is not something that’s necessarily obvious if you’re building your own system. I think it’s not as frequently covered in a lot of introductory materials.
Whereas telling a newbie who’s learning PHP that he has to escape anything that goes into the database, and that he has to escape user-generated HTML output being sent to the user’s browser, is common, explaining how to prevent CSRF is something I don’t see as often in most introductory texts or lessons. But maybe as this becomes a more prevalent and widespread attack vector, it’s quite possible that will change.
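The protection Rails builds in works along the lines of the synchronizer-token pattern. Here is a minimal Python sketch of the idea; every function name here is hypothetical, not a real framework API:

```python
import hashlib
import hmac
import secrets

# Server-side secret; a real app would load this from configuration.
SECRET_KEY = secrets.token_bytes(32)

def csrf_token(session_id: str) -> str:
    # Derive a per-session token the server can recompute later.
    # A cross-site attacker cannot read the victim's page, so a
    # forged form has no way to include this value.
    return hmac.new(SECRET_KEY, session_id.encode(), hashlib.sha256).hexdigest()

def is_legitimate(session_id: str, submitted: str) -> bool:
    # Constant-time compare avoids leaking the token via timing.
    return hmac.compare_digest(csrf_token(session_id), submitted)

session_id = "session-abc123"
token = csrf_token(session_id)  # embedded as a hidden form field

print(is_legitimate(session_id, token))      # True: genuine submission
print(is_legitimate(session_id, "forged"))   # False: cross-site forgery
```

The key insight is that the token ties the form submission to the session: a request arriving without it is rejected, no matter which site the form came from.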
Kevin: Yeah. I think one of the things this always kind of brings up is just making yourself more knowledgeable, especially on the subject of plugins and that kind of thing. There’s this directory of exploits that you can go to. It’s called, I think, exploit-db.com. Yeah, that’s it. Basically, you can go here and look up any kind of exploit you want, maybe one for WordPress or WordPress plugins.
Usually the more popular things are on there. If that’s not enough for you, there are a number of toolkits and flavors of Linux that you can use. I personally use BackTrack Linux to do pen testing for my own sites whenever security is of importance. It has a lot of cool tools and things that you can check out to kind of help you along the way.
As you begin to learn, you can kind of look into one tool, right, XSS attacks maybe, and do some pen testing for that kind of thing. So there are a number of places you can go to check these things out, but I think those two are pretty good starting points. Do you use anything for that, or do you do pen testing yourself?
Louis: Yeah. In my case, because we’ve got a fairly… I don’t want to say we have a big team, there are like five of us, but our DevOps/sysadmin guy is really focused on the security side of our applications. On the developer side, for those of us who just write application code day in and day out, and I’m one of those, obviously you keep these things in mind, but again, with the types of frameworks we’re using, we rarely if ever write SQL queries directly.
Almost all of that is handled through the ORM. Output HTML obviously gets put through escaping, and there aren’t a lot of situations in our application where we allow users to enter HTML and then have it rendered. Everything does get escaped.
Yeah, it’s something you definitely have to be aware of, but in terms of actually testing it, that’s not something I take care of directly.
Kevin: I’m kind of in the same boat. I’ll boot up… if you want to find it, it’s backtrack-linux.org. I’ll boot that up every once in a while, just to kind of see, like, if I have a questionable plugin or something like that, you know.
Louis: Yeah. I guess, you know, if you’re using WordPress with a lot of plugins, or, you know, I think it’s more of a concern for people who either have an entirely custom app, where they don’t have the benefit of a big open source community like WordPress or Rails or Drupal have, or they’re using plugins from, you know, smaller single developers or anything like that.
That’s not to say that if you’re using just vanilla WordPress and keeping it entirely up to date, or likewise, if you’re using, you know, Rails or any Python library and keeping it up to date, that you’ll be safe. But it’s harder to make a rookie mistake if you’ve got a good body of code behind you. I had not seen this BackTrack Linux.
Kevin: It’s awesome.
Louis: Do you want to explain like a little bit how it works? How does having a different distribution help you to do the penetration testing?
Kevin: Yeah. BackTrack Linux is basically just, I guess, a flavor of Ubuntu. I’m not even sure specifically about the technical aspects of it. But what you do is you install it, it gives you the command line, and you boot into the GUI from the command line from there. Once you have that up, it’s basically just a ton of prebuilt tools that help you do pen testing. It’s really just gobs and gobs of these open source tools that are out there, and, I mean, personally, I’m not a pen tester.
I just go through the menus, find the ones I want to run, or I Google against BackTrack and find out which tools I need for certain things, just to kind of test. Overall, it’s really nice because it gives you a user interface to go through and pick the things that you want and test the things that you want to.
On top of that, because they’re all in there, you can kind of go through and see, oh, here’s a new tool that I don’t know about, let me check that out and do a little bit of investigation. I use it to learn mostly, just to kind of see, hey, here’s something new that I didn’t know, and also for the ease of use it gives me, you know. Just to be able to turn it on and go and scan something.
Louis: Very nice. Definitely something to have a look at. Cool. So, interestingly, this story didn’t come across my filter; I was just going this morning to check out what’s new on ReadWriteWeb, and I noticed that they have launched a new site redesign. I thought that would be interesting to talk about, not so much the business aspect, but from a technical point of view it’s pretty interesting.
First of all, it’s a fully responsive site, which I guess is to be expected from a new technology site redesign at this point in time; I think I’d be disappointed if it wasn’t. But there are a few other interesting technical tidbits. One of them, I don’t know if you guys have noticed this, is the way the sidebar scrolls: the sidebar scrolls differentially from the main content as you scroll down.
I think it’s set up so that when you get to the end of the article, you also get to the end of the sidebar, which I think is really cool, because a lot of the time you’ll have a sidebar that’s a lot shorter than your main content, and as you scroll, suddenly there’s nothing, just this big white space off to the right or the left. This way, as readers scroll through the content, there’s always something there, which makes your ads a little more visible if you rely on advertising, or makes your additional content or options a little more visible to the user. I thought that was pretty cool.
The other thing I thought was interesting: if you look at the blog post written by the new editor in chief of ReadWrite.com (it actually also changed names, but I’ll get to that in a second), he mentions that the new design was done with a tablet-first approach, which I guess is kind of unique, because you hear a lot of people talk about mobile first as an approach to design, where you start with a bare-bones mobile view. That gives you the ability to not have to worry about, ‘OK, we’ve got this desktop-centered website. Now we have to cram everything down into a mobile view.’
Instead, you start with bare bones, then you sort of add on features.
We’ve spoken about this several times on the show before: it gives you the ability to have a really fast and lightweight mobile experience where you only add what you need. But in this case, taking a tablet-first approach is really interesting for a news site. I don’t know whether it came from their site statistics, whether they see 30% or 40% or 50% of their users accessing it on a medium-sized touch screen and decided to focus on that. I don’t know. What do you guys think about tablet first as an approach for designing something like this, where it’s content and news, which is exactly the kind of thing I think most people tend to consume on tablet devices?
Kevin: I feel like this was a CEO decision. I just get that kind of instinct. Like, ‘I have this iPad and I want it to look nice.’ That’s what I would do if I was a CEO. I’m not lying either. That’s what I would do: make it work on my device.
Patrick: Yeah. I mean, there’s a number of changes that have taken place since it was acquired by Say Media, which happened in December. The rebranding was announced a while back; I remember them mentioning the name change from ReadWriteWeb to ReadWrite, and now there’s the new design and a new editor in chief. Richard MacManus has just departed, and Richard is…
I met him a couple of times at conferences. Really nice guy, always very approachable, and obviously ReadWriteWeb is a very influential tech publication.
He created something with a unique voice and it’s kind of sad to see him go, but obviously everything comes to an end. The new editor in chief is actually Dan Lyons, who used to be the technology editor for Newsweek and might be best known as ‘Fake Steve Jobs’. He’s made the rounds at many different publications. So…
Louis: I did not realize that he was ‘Fake Steve Jobs’.
Kevin: Yes, this is the one, the only Dan Lyons. As far as my thoughts on the design, I mean, it’s nice, it’s clean. I can see the merit of the tablet-first approach as you described it. I did notice that scrolling thing; it’s kind of the first thing I noticed, how it was scrolling differently, and you know, it doesn’t bother me. I don’t know that I love it. It’s just there. It’s nice. It’s very clean versus the old ReadWriteWeb.
The advertising on ReadWriteWeb, I don’t know, I’m trying to think back to the last design, but I don’t think they were as heavy as a lot of other sites, and this one is particularly ad-light. Yeah, I don’t know what else to say about it. It’s very clean, it is very appealing, and I hope it works out for them. You know, I don’t wish them anything but the best.
It’s always kind of weird to see these kinds of brands that are well established on the web and in tech, like ReadWriteWeb, just change name and change domain, and I know they put in all the necessary time putting all those redirects in place, redirecting all the old links and old stories to go to ReadWrite.com. It will be interesting to see how it changes under Lyons.
Louis: Yeah. Just a bit more from a technical standpoint: one of the other things you’ll notice as you click around the site is that it appears almost all of the content is loaded in with Ajax. So if you hit a link, you’ll get the frame of the page loading straight away, and then a little Ajax spinner as the content gets loaded in, which is interesting.
Now, that’s an approach that was also taken, as I think you mentioned when we were talking before this show, in the Gawker Media redesigns. There are a few differences here. One of them is that ReadWriteWeb, or ReadWrite.com, is not using hashbang URLs; the URL is just a regular URL. It’s year/month/date/title slug.
One thing I do notice, though, is that this site feels a little bit slower to me than I would expect. I don’t know if you guys are having the same experience. I’m used to things being a little more snappy, and clicking through to an article, for example, takes a moment and feels less than optimal. I’m sure there are a lot of optimizations that will be possible; it just seems like they’ve really gone all out on the design and the features of the site, and everything really works gorgeously. Like you said, the redirects work, which is something Gawker didn’t do so well on.
But in terms of speed, it feels like there’s definitely something lacking, and I think maybe that speaks to an important factor when doing this kind of redesign: you really have to consider speed as a feature. You know, it feels like there’s a definite decrease in quality there from the older site. I don’t know if you guys are experiencing that as well.
Patrick: Yeah, actually, I am experiencing what you’re talking about, I think. I click link “A”, and I’m not so impatient that I’m like, “Oh well, screw this site. I’m leaving,” but it is there. I clicked, 1, 2, 3, 4… I’m actually waiting right now. It just loaded. That was just to load one article.
Louis: It’s definitely a several second load time.
Patrick: Yeah. I see the spinner thing. I don’t actually get a frame myself; I get the spinner thing until pretty much everything loads at the same time.
Patrick: I’m just looking at the white screen with that spinner thing until that moment.
Louis: Yeah. So, you know, that’s one of the things I kind of wanted to talk about, because in this case, I don’t want to be critical of the team that was responsible for this redesign. It’s a fantastic piece of work, obviously.
Patrick: We hate them and it’s ugly. Go ahead, next.
Louis: Obviously, it’s a very pretty site. I like the lay out. It gives you this big image on the front screen. Obviously, the responsive notice of it is fantastic as well. It looks great on a small screen. But, when doing this kind of redesign, let’s just say for the sake of argument that from clicking a link to seeing the article or to be able to start reading the article on the old site was, you know, 1.5 or 2 seconds. Then after the redesign, it’s 4 and a half or 5 seconds.
Patrick: Yeah. Just something to add to that point: do you know if Gawker is still doing this exact thing? Because their sites seem to function a little differently than I remember from when that whole thing first went down.
Louis: Yeah, their weird scrolling… I think it’s actually come back to more of a normal website-style thing now, a little bit. No, they still do the weird scrolling. OK.
Patrick: OK. Do they? OK. It kind of all scrolls at once. I’m not sure what… it’s like it’s different now, and there’s no hashbang thing going on, and I don’t know.
Louis: No, they’ve dropped hashbangs. It used to be that the stuff was just systematically broken. Especially going from mobile, because there’s a number of redirects going on: there’s an Australian version of Lifehacker, for example, at lifehacker.com.au, and furthermore, there was a mobile version. Someone would post a link to the mobile version of a full story on Twitter, I’d click it on my phone, it would try to redirect me to the mobile version and then to the Australian version, and essentially I’d wind up on a home page and the article was just gone.
That would happen pretty frequently to me in the months after the redesign: you would click on a link in Google and get a 404, because the redirects were just not in place or the hashbang URLs weren’t working. That’s definitely not the case anymore; the back button works and everything you’d expect.
Patrick: OK. Cool.
Maybe it’s in progress and it wasn’t quite ready to go, and we will see something like that in the near future. But it’s just interesting to me that for a redesign of this otherwise fantastic quality, there could be such a significant decrease in the performance of the app.
Kevin: You know what? Now that I think about it, I was like, I’ve seen this before somewhere, where they tried to fade content in to make it seem elegant. I remember Google did that a while back, and they no longer do it.
Louis: Oh yeah, right. When the content first loads, it’s not like, boom, here it is.
Kevin: Are you talking about Google search, like as it adds things to the page as you type more words? Well, no, the Google homepage: at one point, they had set it up so it was just the Google logo and the search bar.
Patrick: Oh, right.
Kevin: Then when you move the mouse, everything would fade in, including the navigation and all.
Kevin: And they’ve removed that. Looking back on it, it was quite annoying when Google was doing that. It just had a weird feel to it, I don’t know.
Louis: We’ve been talking about this for long enough that I am going to actually do some real testing here.
Kevin: I’m running PageSpeed Insights for Chrome.
Patrick: I’m spinning back and forth on my chair, waiting for your results.
Louis: OK, there we go. I just got content on the screen at, I believe, somewhere in the area of 17 to 19 seconds from a hard refresh. Going back in the timeline, on the initial HTML request there’s a lot of latency; it takes about half a second for the server to respond to the initial request. Yeah, a lot of this is latency.
Even the actual download time is pretty limited. The CSS file is a 9 kilobyte gzipped file, which is fine, that’s nice and small, but there’s a 200 millisecond latency before the server even responds to the request to serve it to me.
Kevin: It looks like a lot of the things I’m seeing, too, are responsive images.
Kevin: Yeah. I’m looking through the Google page speed and they’re serving up the…
Louis: So that time it took 14 seconds.
Kevin: We are working this down, aren’t we? They’re serving up the Twitter.com widgets.html twice, pulling that in twice. I don’t know if that’s due to separate pieces of content needing different things. I don’t know. But that’s significant; you know, it’s 20 kilobytes.
Louis: Well, yeah. I mean, it’s not so much the raw size of most of the things here that I find disturbing; it seems like the server takes a long time to come back with them. Everything was a little bit faster that time; I still didn’t see content on the screen until about 12 seconds in, but we’re still seeing a few hundred milliseconds of latency for every request, including just static files.
Kevin: Yeah, it doesn’t even look like their files are minified. I’m looking at the CSS and it’s broken up onto different lines. I mean, there are some basic things that you can do, especially with a site like ReadWriteWeb. It’s the little things that add up when you have a site of this scale, and if you don’t get those right, you may get the larger items right, but, you know, if you’re not compressing and saving every little byte, it can make a big difference, especially when you’re trying to serve out a page initially and a bunch of people are coming to it.
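Kevin’s “every little byte” point is easy to demonstrate. This toy Python sketch compares a whitespace-heavy stylesheet against a naively “minified” version and a gzipped version; the CSS and the stripping rule are made up, and a real minifier is far more careful than this:

```python
import gzip

# A repetitive, whitespace-heavy stylesheet, standing in for real CSS.
css = (
    "body {\n"
    "    margin: 0;\n"
    "    font-family: sans-serif;\n"
    "}\n"
) * 50

# Naive "minification": strip newlines and indentation.
minified = css.replace("\n", "").replace("    ", "")

raw_size = len(css.encode())
min_size = len(minified.encode())
gzip_size = len(gzip.compress(css.encode()))

# Both steps shrink the payload; gzip wins big on repetitive text,
# which is why production sites typically minify AND compress.
print(raw_size, min_size, gzip_size)
```

None of this fixes the per-request latency Louis measured, but it is the kind of straightforward, low-risk saving Kevin is describing.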
Kevin: Now, this is of course just a feeling, but I feel like a lot of these issues are coming from the responsiveness of it, it needing to be responsive. I feel like a lot of these things that appear to have been overlooked may have come down to hours. When you think about a project, everybody has hours, internally or externally, and I feel like the hours they needed to optimize this were eaten by the responsive work, if that makes any sense. Because there’s a trade-off to make: you have to launch something on time.
Louis: Yeah. I mean, obviously there’s a lot of time pressure. Yeah.
Kevin: I don’t know, that’s just been my experience with projects, working externally with companies and also internally: you make a trade-off because you want to launch a feature, because you have, you know, press releases timed and all these things. You make commitments to get the word out about something you’re doing, you have all these articles written for specific dates and times, and so you have to launch it, right?
Kevin: And with responsive, I mean, they have, like, three different versions of the site. What I’m saying is, I think it will speed up over time. I feel like they just haven’t had the budgeted amount of time before launch.
Louis: Yeah. I mean, like I said, there’s a ton of things here. Just looking at the page speed and bytes, there are definitely a few things that are straight-up easy fixes. Like I said, the latency serving the static files seems much higher than it would be if they were served by something a lot more lightweight. It feels like it might be, you know, trying to load up the whole app stack, if it’s hitting either PHP or Rails to serve the statics. But again, you know, where was I? I can’t even remember the numbers I was talking about there.
Kevin: You know, Amazon went down today. I wonder if they’re having to switch servers, maybe, because they’d be using… I don’t know, but I’m just saying, Amazon did go down today. So…
Louis: Amazon did go down today.
Kevin: That could potentially be playing into this.
Louis: That’s entirely possible. So maybe we’re being entirely unfair here. Right. I don’t want to dwell too long, though it’s probably too late for that, on these perceived performance issues, because all in all, again, it is a gorgeous redesign. I think it does a lot of things really well. Obviously, it’s a big step in the direction of responsive taking over the internet, and I think that’s really good. It’s great to see.
And even if it doesn’t work out for whatever reason, I think it’s good to have big case studies out there of sites that have taken different approaches. You know, like we’ve talked about the Gawker Media redesign and the impact it had, using Ajax for navigation, and how that played out; I think it had a lot to do with the fact that virtually no sites take that hashbang approach anymore.
So one way or another, it’s going to be really interesting to see how this plays out, but obviously it’s great to see a big project embracing new technologies and trying things differently, because it means that if you’re a shop trying to do responsive, you have another case study you can point clients to and say, “Hey, this is the sort of thing we can do, and this is how we can adapt to all the screen sizes that are out there.” So big congrats to the team, you know, all our minor rivalries aside.
So, Patrick, perhaps you had another story?
Patrick: Negative SEO has been a bit of a buzz phrase for a little while now. If you don’t know what it is, negative SEO is the idea that competitors, or people who want to hurt your search rankings, would buy links on other sites, or acquire links on sites that are not good websites, so that Google views them as bad links from bad websites.
This phenomenon came about because Google made a switch where they started to treat bad links, links in bad neighborhoods, to use their own distinction, as negative votes against your site, rather than just ignoring them as they had done previously.
So this whole negative SEO thing came about, and to address the concern, Google has launched a new tool called the disavow links tool. It’s kind of an odd choice of word, disavow, but disavow is the word they chose. This was announced at the Pubcon conference by the head of the webspam team, Matt Cutts.
As you can imagine, the tool is pretty straightforward: you specify particular links, URLs, domains, and so on in a text file, and then Google will probably ignore them and treat them as if they were nofollow links.
Now, I say probably because Google has said they reserve the right to trust their own judgment, but for the most part, they’ll typically use the indication you provide when they assess links. And Cutts said it can take weeks for any changes to go into effect.
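For reference, the file Patrick describes is just plain text, one entry per line. A hypothetical example, with made-up domains and URLs:

```text
# Lines starting with "#" are comments.
# Disavow every link from an entire domain:
domain:spammylinkfarm.example

# Or disavow individual pages:
http://bad-neighborhood.example/paid-links.html
http://bad-neighborhood.example/more-paid-links.html
```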
Now, the article I’m bringing this from is by Danny Sullivan at Search Engine Land. Sullivan asked Cutts why they’re doing the disavow links thing, versus just ignoring links as they had in the past, and Cutts kind of sidestepped that and focused on the benefits of the feature.
One of which, he said, is for people who have purchased a site, or come across a website that has links from bad sites, and now want to create a “clean slate”. So, something for people who buy sites on Flippa, maybe, where you have a site that has bad links in it.
Louis: Thanks. Thanks for the plug there.
Patrick: And it allows you to kind of clear that out. So, yeah, that, in a nutshell, is what this is about: it lets you ask Google to ignore the effect of these “bad links” and hopefully negates the idea that someone could use negative SEO against you. Now, I kind of turn this around, and, you know, Google is obviously very smart, they have a lot of smart people working there, and I’m not questioning the almighty Google per se. But consider whether this is really even necessary. Do they really need to treat links as negative votes? Couldn’t they just ignore them? What do you guys think about this?
Louis: So it’s easy to look at it that way, but I think if you think back, and I don’t exactly remember when it was, but there was a point in time, I want to say around two years ago, when web spam was really at its worst, when on almost any informational search you did on Google, the top results would be these useless filler-content websites with tons of ads and pop-ups. It was just so much junk.
And to be fair, in my personal experience at least, I find that Google has gotten a lot better in the last year. And part of that was, you know, taking people who were doing really over-the-top SEO, whose sites were over-optimized or, you know, link farmed, and penalizing those sites rather than simply ignoring the bad links and letting them slide.
Because if you’re spamming the internet with links and links and links, some of them might be in bad neighborhoods, some might be in good neighborhoods, some might be flagged as bad links and some might not. So if you just take a scattershot approach and put links to your site absolutely everywhere, and some happen to be good and some happen to be bad, just ignoring the bad ones does nothing to reduce your ranking.
So what Google wanted to do, I’m guessing, is say: if a person is spamming the entire internet with links back to their site, not only is that an indication that we should ignore the links that are in bad neighborhoods, it’s an indication that this person is behaving unethically in their attempts to rank their website. And even though we can’t algorithmically identify the links that appear to be in good neighborhoods as bad links, they probably are bad links.
They were probably acquired either maliciously or paid for. And so, you have to kind of offset that. So I think it really did a good job. I don’t know if you guys had the same experience, but it felt like the results have gotten a lot better over the past few years in terms of web spam.
Patrick: Right. I do think that Sullivan’s point, though, is that what they did is they created negative SEO, right? They created the idea of it, because what they’re saying is they can identify the bad links algorithmically.
Louis: No. I’m saying they can identify some bad links. But let’s say I have a website and I want to rank it, right? So I go out and put, you know, links all over the internet. I use malicious scripts to inject links in some places. I buy links from link farms. I do everything I can to generate the largest number of links, right? Even if 90% of those can be identified as spammy links, if 10% of them look good, that still might put me ahead of a legitimate competitor who would rank higher based on the quality of their content and the quality of their links.
But because, you know, through this scattershot approach of bombarding the entire internet with links, I managed to get some links in there that Google’s algorithms couldn’t catch, I still win. The point of devaluing those negative links is to make it so that if I’ve got, you know, 90% obviously spammy links and 10% good-looking links, I shouldn’t just get that 10% for free.
There should be a cost to the fact that I’ve got this 90% of spammy links. And obviously negative SEO is a consequence of that. One of the things that I think might get missed here is that, whereas the idea of going out and getting links and, you know, getting people talking about your content is something any content creator or business owner can understand, the idea of using this disavow links tool, and of negative SEO itself, is pretty complicated.
So if one of your competitors is engaging in these kinds of unethical tactics, it means you actually have to go out and find out how this happened and all that. I think Google might be a little bit out of touch here. Do you feel like it’s maybe out of reach for most people who have websites, if they are the victims of negative SEO, to know how to use this disavow links tool?
Patrick: Well, to me, I think that’s a fair point, and I just wonder about the process behind this, because it’s one of those cases where we can talk about negative SEO, but I can’t talk to anyone in my family about it. And some of them run small businesses that have websites, and that doesn’t register with them. That’s not a thing.
So it’s tricky, because if you have that one person in a market who knows what this is, and is unethical, and they use it to their advantage, then how are the victims even going to know, right? I don’t know a whole lot about this because I haven’t read about some of the details, like the link warnings which Sullivan wrote about. I guess there’s some notification, but he kind of alluded to it not being very helpful to people. And yeah, that’s kind of my concern here, because like you said, there’s a trade-off there, and I don’t know how much I really…
It’s Google’s index. They can do what they want. I just don’t know how much I trust Google to identify this link as bad and that link as good. But that’s what they’re going to do, so I have to live with it. I see the trade-off there, because if you penalize for the bad links, let’s assume they’re all bad and Google’s algorithm is 100% right, then that’s good: they penalize people who have those bad links and allow other people to rise up. The trade-off then being that it will be used negatively. It’s not if, it’s will be. There’s no doubt about that.
And then a lot of people it will be used against won’t have any idea, won’t be able to see and receive the warnings, and as such will just be impacted. How much will it impact their business? Who knows. Obviously we’ve heard of businesses that have lived or died by the swing of the Google results. But yeah, there’s a trade-off for everything.
But to me, it’s worth discussing, worth thinking about the negative impact of this, though I’m sure Google has done that, and it’s good to have a tool to disavow those links. I just don’t know exactly how I’m going to be made aware.
Louis: Yeah. I think, well, obviously it’s something that web developers should be aware of, and we’re maybe reaching a point now where Google has provided enough tools and enough information that your everyday, you know, small design and development shop can have a little bit of basic SEO knowledge. And it’s not about SEO in terms of, you know, going out and getting links, or doing keyword research and figuring out how to rank.
It’s just a matter of doing all the basic stuff you need to give your site the best chance it can get, and that means, you know, claiming your local listing in Google local search and in other search engines as well. It probably also includes using things like Webmaster Tools to find out what keywords you’re ranking for, and maybe buying ads for those keywords. And it will now also include, you know, being aware of whether there are negative links out there pointing to your website, and how you can do a good job of eliminating those.
So it seems like there’s room for a kind of SEO service that isn’t what we traditionally think of when we think of SEO services. Even if you do nothing in terms of keyword research, or crafting keywords for your content, or link generation, if what you do is those basic steps of, you know, keeping your profile with the search engines clean and focused on the legitimate links, I think that’s a really important service for a lot of smaller web developers or small design agencies to offer to clients, because you don’t need to know a lot about search engine optimization or how the search algorithms work to do that.
You don’t need to be an SEO specialist. You can be a web designer or a developer who does WordPress stuff, and as part of setting up your client’s site, you register the local search listings, you set up a Google Webmaster Tools account and start monitoring these things, and, you know, provide that as an add-on. It’s something you can potentially also charge for on an ongoing basis, if it’s, you know, something where you say “We’ll monitor your links and make sure that nobody’s doing any kind of spammy attacks on your search engine rankings in the future.”
And it’s, I think, a very white hat approach to doing SEO for clients, and I think it can be really good for developers to have that in their toolbelt. So it’s good for Google to provide these tools; even if they’re not usable by website owners, they’re at least usable by the people who built the websites.
Patrick: Yeah. But you realize how much of a scam that sounds like to an average person? Like, Google creates the problem, then offers tools that other people can charge the unknowing public to use. It’s essentially create the problem, create the market, and create an extra cost for the average small business. I’m not saying I absolutely feel that way, but, you know, the way you just explained it, that’s how a lot of people will realistically, and honestly fairly, interpret it.
Louis: Yeah. But to be fair, that problem already existed. That problem has existed since the dawn of search engines, right? Before Google did anything to combat web spam, any small web development shop could say “we’ll also, you know, get you to the top of Google”.
Louis: And that’s it. It’s the same spam, except what they would do to achieve that is, like I said, buy links and farm links, and that reduced the overall quality of the experience for Google’s primary customers, which are the searchers, not the people building the websites. So now I feel like the service a web designer would be providing is less spammy than it was, in the sense that now what you should be doing is, you know, playing by the rules.
If you notice that there’s something wrong, go through Google’s channels to fix it; make sure everything’s nice and tidy and well organized and you’ve got good quality links, but don’t go out there and farm them or buy them. So it feels like that’s a more stand-up thing to be doing. All you’re doing is maintaining a good relationship with Google on behalf of your client.
And that’s not the same as spamming links, in my opinion. I think, yeah, I understand it can sound that way depending on how you sell it to your clients, but to me it sounds more legitimate than the situation we had 5 or 10 years ago, with spammy SEO companies cheating clients while at the same time creating a huge spam problem for the rest of the internet.
Kevin: At the end of the day, if you don’t like what Google is doing, you just switch to Bing.
Patrick: Right. I mean we can all say that, but you know. Yeah.
Kevin: It’s just that the issue is my e-mail is tied to Google and my documents are tied to Google and…
Louis: And my maps and my phone and…
Patrick: And for a lot of webmasters and website owners, Google represents a majority of the revenue and certainly a majority of the traffic. So it’s a powerful stick.
Louis: Yeah. Certainly.
Louis: Yeah. Let’s do it.
Kevin: Let’s do it.
Louis: Cool. I’ll go first. My spotlight this week is a little tool that one of my coworkers happened across, I think on Hacker News a couple of days ago, and forwarded through to me, and I thought it looked really cool, so I’ll share it with you today. It’s a tool called jq, which is a little bit confusing because you might think it sounds like jQuery or jQTouch or something like that, but it’s nothing like that at all. It’s a command line tool, along the lines of grep or sed, you know, the very lightweight command line tools that anyone who works from a Linux command line will be familiar with, and what it does is process JSON from the command line. It’s written in portable C.
So it has no dependencies. It’s a single binary; you put it in your bin directory and it just works on basically any Unix machine, I assume. I haven’t tried it on OS X, but I expect it’ll work there too. If you go to the tutorial page, you can get an example of what it allows you to do. So, you know, you might have had the experience where, for example, you want to look at an API that returns JSON and you want to get an idea of what kind of thing it returns.
And sometimes, if you need to do auth, or include a key, or make a POST request rather than just a GET, it’s not exactly convenient to run that through your browser. Or if you’re going to get a lot of data and you want to be able to search through it or filter it down, again, using your browser can be really awkward. But in this case, you know, you can pass the URL you’re targeting to curl and then pipe that through jq, and either get all of the results, which come out really nicely formatted.
There are none of those backslash-escaped slashes; everything’s nicely printed out and formatted. Or you can use selector-style filters to get just a subset of a result. So if you only want, for example, one particular node in the JSON, you can pass through some filtering arguments and get access to that. For things like inspecting JSON APIs, or working with JSON in a shell script, or just playing around at the command line, it seems like a really powerful tool. So I thought people could check that out. It’s at stedolan.github.com/jq; I’ll put the link in the show notes.
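As a quick sketch of the workflow Louis describes (the API URL at the end is made up; the inline JSON just stands in for what curl would return):

```shell
# jq '.' is the identity filter: it pretty-prints whatever JSON comes in
echo '{"name":"jq","tags":["cli","json"]}' | jq '.'

# Select a single field; -r ("raw") strips the surrounding quotes
echo '{"name":"jq","tags":["cli","json"]}' | jq -r '.name'
# -> jq

# Drill into arrays by index
echo '{"name":"jq","tags":["cli","json"]}' | jq -r '.tags[0]'
# -> cli

# The same filters apply to a live API piped in from curl (hypothetical URL):
# curl -s https://api.example.com/items | jq '.items[] | .id'
```

The -r flag is handy whenever you feed jq’s output to another shell command, since you usually want bare strings rather than quoted JSON values.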
Kevin: Very cool. So my spotlight for today is on Jonathanbrooke.com, where he writes a blog post about testing and debugging media queries using CSS. It’s kind of a cool little technique. It’s basically CSS pseudo-elements; he uses the before and after pseudo-elements to post a message at the top of the screen saying what size the browser is. It’s an interesting little technique.
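The technique Kevin mentions can be sketched in a few lines of CSS. This is our own minimal version, not the code from the post; the breakpoints are arbitrary, and hanging the label on body::after is just one common choice:

```css
/* Debug label pinned to the top-left of the viewport */
body::after {
  content: "no media query active";
  position: fixed;
  top: 0;
  left: 0;
  padding: 2px 6px;
  background: #222;
  color: #fff;
  font: 12px monospace;
  z-index: 9999;
}

/* Each matching breakpoint overwrites the label,
   so resizing the window shows which query applies */
@media (max-width: 768px) {
  body::after { content: "max-width: 768px"; }
}

@media (max-width: 480px) {
  body::after { content: "max-width: 480px"; }
}
```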
Louis: Cool. I will definitely check that out.
Kevin: Now, I send you the link!
Patrick: All right.
Louis: Go ahead.
Patrick: And my spotlight is a blog of sorts hosted on WordPress.com. It’s at fakedrpepper.wordpress.com. I recently came across this, and if you’re familiar with Dr. Pepper, the soda, you might also know that there are a lot of knock-offs of Dr. Pepper. This blog is called ‘Not quite what the doctor ordered’, and all the posts are from May of 2008, so I’m not sure if it’s just sort of, you know, a database or library sort of website here.
But no matter the fact that it’s 4 years old, it’s still fun to read through, because you get just a bunch of the Dr. Pepper knock-offs that are out there, what they look like and their different names, and, you know, I just found it funny.
I was recently doing something with some Dr. Pepper knock-offs and I came across Dr. Perky at Food Lion. And I really thought, you know, Dr. Perky? I don’t know if I would trust Dr. Perky, but, you know, everyone has to choose their own doctor. So yeah, this is kind of a quirky little site, especially if you like soda. Have a look through all the different store brands and knock-offs of the Dr. Pepper brand that are out there, or at least a great chunk of them.
Louis: Dr. Thunder.
Patrick: Yeah. Dr. Thunder. That’s the Walmart brand. I’ve actually, that’s something… one of the ones I had.
Louis: Dr. Springtime.
Patrick: Rocky Top Dr. Thunder, or Dr. Topper. Sorry. Rocky Top Dr. Topper. That’s weird.
Kevin: I mean, there are a lot of doctors out there, and there’s also this thing where some of them use a period after the R and some don’t. Dr. Pepper doesn’t, but then some of the other doctors use the period.
Kevin: Right. I don’t know how that stands grammatically but it does differentiate them to some extent.
Louis: Is this actually a thing? Forgive me, a foreigner’s question, but in the United States is there some kind of law where cherry cola has to have the name Doctor?
Patrick: No. And you know, cherry cola isn’t really what Dr. Pepper is even viewed as. I mean, Dr. Pepper kind of markets itself as this unique spiced drink, kind of the spiced genre. And Mr. Pibb actually calls itself, I think, a spiced cherry… do they use the term cola or not? I don’t know. But yeah, no, it’s not a law, it’s not a rule. Actually, the funny thing about it…
Louis: Obviously, I was joking.
Patrick: …was that, you know, Dr. Pepper started in the 1880s, and Mr. Pibb, which was the Coca-Cola competitor, started in the early 1970s. It started as Peppo, but Dr. Pepper sued them and they ended up with Mr. Pibb. Later Coke tried to buy Dr. Pepper but was blocked on antitrust grounds, the claim being that there might be a monopoly of the, quote, “pepper” soft drinks. So as you can see, it’s actually pretty serious business, the pepper drinks.
Louis: Right. Because the base still tastes mostly like cherry cola, right? There’s an extra something in there but…
Patrick: Yeah. I mean, I think for Dr. Pepper they say there are 27 unique flavors in it. But obviously different people taste different things.
Louis: All right. Well, that’s the SitePoint Podcast, your number one source of information for web design, development and spiced colas. All right. Take it away.
"What makes a great CTO?" Engineering skills? Business savvy? An innate tendency to channel a mythical creature (ahem, unicorn)? All of the above? Discover the top traits of the most successful CTOs in this free guide.