<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>CapeLinks Blog &#187; Web Development</title>
	<atom:link href="http://www.capelinks.com/blog/category/web-development/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.capelinks.com/blog</link>
	<description>Online Marketing, Advertising, Internet Strategy &#38; Other Stuff</description>
	<lastBuildDate>Tue, 04 Oct 2011 19:16:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Fighting Member Registration Spam in ExpressionEngine</title>
		<link>http://www.capelinks.com/blog/fighting-member-registration-spam-expressionengine/</link>
		<comments>http://www.capelinks.com/blog/fighting-member-registration-spam-expressionengine/#comments</comments>
		<pubDate>Tue, 11 May 2010 14:52:50 +0000</pubDate>
		<dc:creator>CapeLinks</dc:creator>
				<category><![CDATA[Web Development]]></category>
		<category><![CDATA[ExpressionEngine]]></category>

		<guid isPermaLink="false">http://www.capelinks.com/blog/?p=616</guid>
		<description><![CDATA[<p>Member registration spam has become an issue in running ExpressionEngine websites. I have been using EE since it was called pMachine and up until a few months ago, member spam was not really a widespread problem.</p> <p>ExpressionEngine has grown in popularity over the last few years (especially since the release of the free core version) and now that enough people are using it, EE has become a target for spammers.</p> <p>There have been many solutions to combat member registration spam posted in the EE forums and most recently in the ExpressionEngine blog.</p> <p>These methods include changing the member profile trigger [...]<p><a href="http://www.capelinks.com/blog/fighting-member-registration-spam-expressionengine/">Fighting Member Registration Spam in ExpressionEngine</a> is from <a href="http://www.capelinks.com/blog/">CapeLinks Internet Marketing Blog</a> located on <a href="http://www.capelinks.com/">Cape Cod</a>.</p>
]]></description>
			<content:encoded><![CDATA[<p>Member registration spam has become an issue in running ExpressionEngine websites. I have been using EE <a href="http://capelinks.net/about/internet/expression-engine/">since it was called pMachine</a> and up until a few months ago, member spam was not really a widespread problem.</p>
<p>ExpressionEngine has grown in popularity over the last few years (especially since the release of the free core version) and now that enough people are using it, <strong>EE has become a target for spammers</strong>.</p>
<p>There have been many solutions to combat member registration spam posted in the EE forums and most <a href="http://expressionengine.com/blog/entry/fighting_registration_spam/">recently in the ExpressionEngine blog</a>.</p>
<p>These methods include changing the member profile trigger word, advanced captcha, etc&#8230;</p>
<p>First, changing the profile trigger word is not going to work for long unless you change it every few days. All it does is cause a failed registration in the automated spamming software, which runs off a list of sites that is fed into it. As soon as the list of URLs to spam is updated (usually via a Google search &#8211; see below), you will start getting spam registrations again.</p>
<p>While this may throw off the spammers temporarily, it is not a very good long term solution. Why?</p>
<h2>It&#8217;s the footprint, stupid</h2>
<p>The best way to combat member registration spam is to <strong>remove the footprint</strong>. This means you need to <strong>remove any reference to ExpressionEngine in all your templates</strong>, especially in your forum and member registration templates.</p>
<p>Spammers target sites to spam <strong>by using searches to extract lists of sites to target</strong>. Take this simple search for example:</p>
<p><a href="http://www.google.com/search?q=inurl%3Aregister+expressionengine+registration">inurl:register &#8220;expressionengine&#8221; registration</a> <em>(about 15,600 results at this time)</em></p>
<p>Even if you changed the member profile trigger word, your site would still bear the telltale footprint &#8220;ExpressionEngine&#8221; and show up in searches similar to the above example.</p>
<p>The word ExpressionEngine itself is not the only footprint that can be targeted by spammers. There are many other advanced &#8220;footprint&#8221; searches that can turn up EE and other CMS sites to add to spam targeting lists.</p>
<p>Most of these relate to the default text for registration fields, comment fields, footer, etc&#8230;</p>
<p>Footprints like:</p>
<ul>
<li>&#8220;Password Confirm&#8221;</li>
<li>&#8220;Screen Name&#8221;</li>
<li>&#8220;notify me of follow-up comments&#8221; (about 156,000,000 results)</li>
<li>&#8220;Remember my personal information&#8221; (about 1,640,000 results)</li>
</ul>
<p>Unfortunately, <strong>removing these footprints is the only long-term strategy</strong> for stopping or at least minimizing the impact of spam on your EE website.</p>
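<p>To get a rough idea of which footprints your own templates still carry, you could run the rendered page text through a small script. The following is only a sketch in Python: the phrase list is just the examples from this post plus the product name, and <code>find_footprints</code> is a made-up helper name, not part of EE.</p>

```python
import re

# Assumed footprint list: the phrases named in this post plus the product name.
FOOTPRINTS = [
    "ExpressionEngine",
    "Password Confirm",
    "Screen Name",
    "notify me of follow-up comments",
    "Remember my personal information",
]


def find_footprints(template_text, phrases=FOOTPRINTS):
    """Return the phrases that occur in template_text, ignoring case."""
    found = []
    for phrase in phrases:
        if re.search(re.escape(phrase), template_text, flags=re.IGNORECASE):
            found.append(phrase)
    return found


# Hypothetical page text standing in for a rendered registration template.
sample = "Screen Name: ____  Powered by ExpressionEngine"
print(find_footprints(sample))
```

<p>Any phrase it reports is one a spammer could search for, so reword or remove it in the template that produced the page.</p>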
<h2>Human spammers</h2>
<p>There is no doubt that most of the spamming is done by bots or software, but there are several overseas outfits that employ actual humans to do this.</p>
<p>This means that <strong>advanced captcha and reCaptcha</strong> tricks will, at best, <strong>stop some of the automated spam</strong>. Human involvement has been apparent in some of the EE member profile spam I have seen.</p>
<h2>At the very least deny the benefit</h2>
<p>You should stop your member list pages from being indexed by <a href="http://expressionengine.com/docs/cp/admin/members_and_groups/member_groups_edit.html">turning off the Guest Member Group’s ability to view Public Profiles</a>. Plus, you can block search engine spiders from member profiles via robots.txt:</p>
<blockquote><p>User-agent: *<br />
Disallow: /member/<br />
Disallow: /forums/member/</p></blockquote>
<p>This will make the spammers&#8217; attempts at gaining backlinks fail, because the member profiles will not be indexed by search engines and <strong>will not count as backlinks</strong> for the spam websites.</p>
<p>While you&#8217;re at it, add the member registration forms to the robots.txt as well. This <strong>may</strong> keep your registration forms out of the search index and make them harder for spammers to find:</p>
<blockquote><p>User-agent: *<br />
Disallow: /member/register/<br />
Disallow: /forums/member/register/</p></blockquote>
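<p>If you want to sanity check rules like these before uploading them, Python&#8217;s standard library ships a robots.txt parser. This is a minimal sketch, assuming the default &#8220;member&#8221; trigger word and a made-up crawler name (&#8220;ExampleBot&#8221;); <code>is_blocked</code> is a hypothetical helper, and real crawlers may interpret rules slightly differently.</p>

```python
from urllib.robotparser import RobotFileParser

# The rules from this post, combined into one robots.txt body.
ROBOTS_TXT = """\
User-agent: *
Disallow: /member/
Disallow: /forums/member/
Disallow: /member/register/
Disallow: /forums/member/register/
"""


def is_blocked(path, robots_txt=ROBOTS_TXT, agent="ExampleBot"):
    """Return True if a compliant crawler may not fetch the given path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(agent, path)


for path in ["/member/123/", "/forums/member/register/", "/blog/some-post/"]:
    print(path, "blocked" if is_blocked(path) else "allowed")

# Disallow rules match by prefix, so /member/ alone already covers /member/register/.
print(is_blocked("/member/register/", "User-agent: *\nDisallow: /member/\n"))
```

<p>Because the rules match by prefix, the separate register lines only matter if your registration form lives outside the member paths you already blocked. If you changed the profile trigger word, swap it in for &#8220;member&#8221;.</p>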
<p>Member registration, comment and other spam is quite an annoyance, but by following the tips above, you may be able to reduce its impact on your ExpressionEngine website.</p>
<p>Good luck!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.capelinks.com/blog/fighting-member-registration-spam-expressionengine/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>404 Error Control in Expression Engine</title>
		<link>http://www.capelinks.com/blog/404-error-control-expression-engine/</link>
		<comments>http://www.capelinks.com/blog/404-error-control-expression-engine/#comments</comments>
		<pubDate>Sun, 07 Feb 2010 03:22:58 +0000</pubDate>
		<dc:creator>CapeLinks</dc:creator>
				<category><![CDATA[Web Development]]></category>
		<category><![CDATA[ExpressionEngine]]></category>

		<guid isPermaLink="false">http://www.capelinks.com/blog/?p=434</guid>
		<description><![CDATA[<p>One of the things that has always concerned me about using Expression Engine is 404 error control. With the template system being so flexible with segments, includes, etc&#8230; there is a &#8220;vulnerability&#8221; when it comes to 404 error control. Not so much for a small EE site, but on large scale applications there could be problems with large numbers of bogus urls returning 200 status codes.</p> <p>Problems could be caused by template coding errors or on the darker side of things it could be used by a competitor to damage your search engine rankings or create other problems for you.</p> [...]<p><a href="http://www.capelinks.com/blog/404-error-control-expression-engine/">404 Error Control in Expression Engine</a> is from <a href="http://www.capelinks.com/blog/">CapeLinks Internet Marketing Blog</a> located on <a href="http://www.capelinks.com/">Cape Cod</a>.</p>
]]></description>
			<content:encoded><![CDATA[<p>One of the things that has always concerned me about using Expression Engine is 404 error control. With the template system being so flexible with segments, includes, etc&#8230; there is a &#8220;vulnerability&#8221; when it comes to 404 error control. Not so much for a small EE site, but on large scale applications there could be problems with large numbers of bogus urls returning 200 status codes.</p>
<p>Problems could be caused by template coding errors or on the darker side  of things it could be used by a competitor to damage your search engine  rankings or create other problems for you.</p>
<p>They could <a title="forum spamming software" href="http://en.wikipedia.org/wiki/XRumer">point a bunch of links</a> at some bogus urls on your EE site and create duplicate content problems or invent new urls for you, like</p>
<p>yoursite.com/services/consulting/<strong>WE-WILL-RIP-YOU-OFF</strong>/<br />
or perhaps something even nastier than that like<br />
<a href="http://expressionengine.com/showcase/interview/ovation_guitars/DOWNLOAD-WORDPRESS-INSTEAD/">http://expressionengine.com/showcase/interview/ovation_guitars/DOWNLOAD-WORDPRESS-INSTEAD/</a><br />
You get the picture&#8230;</p>
<p>Now I realize that this could also be taken care of by the <a href="http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html">canonical link tag</a>, but that&#8217;s a wicked cop-out. Plus that canonical tag stuff is pretty new and I&#8217;ve been using Expression Engine since it was called pMachine.</p>
<p>For the code examples below let&#8217;s assume that you have ditched the index.php from your EE urls. So instead of the out of the box install (yoursite.com/index.php/blog/post-url-title/) you have clean urls like yoursite.com/blog/post-url-title/</p>
<h2>For Single Entry pages</h2>
<p>On single entry pages like for displaying a blog post you might set up your site so your blog&#8217;s index page is blog/index where blog is also the name of your template group. Your blog/index template displays your blog&#8217;s home page, but also displays your blog posts via a conditional and the segment_2 variable. So if the page requested is yoursite.com/blog/another-blog-post/ it displays the weblog entry with the url_title &#8220;another-blog-post&#8221;. That&#8217;s the way this blog is set up.</p>
<p>What happens when there is a request for yoursite.com/blog/bogus-post-url/ ? You want this to return a 404 error and not just display your blog/index template.</p>
<p>Use the <a href="http://expressionengine.com/docs/modules/weblog/parameters.html#par_req_entry">require_entry</a> parameter in your weblog tag and use the <a href="http://expressionengine.com/docs/modules/weblog/conditional_variables.html">no_results conditional</a> to redirect any bogus urls to your 404 error template.</p>
<p><code>{if segment_2 == ""}<br />
{!--- THIS IS THE BLOG HOME PAGE -----}<br />
{!--- the code to display the BLOG HOME PAGE ----}<br />
...<br />
...<br />
{/if}<br />
{if segment_2}<br />
{!---- THIS IS A SINGLE ENTRY BLOG POST -----}<br />
{!--- NOT USING segment_3 IN THIS TEMPLATE, SO IF segment_3 then 404 it ---}<br />
{if segment_3 != ""}{redirect="404"}{/if}<br />
&lt;head&gt;<br />
{exp:weblog:entries weblog="blog" limit="1" require_entry="yes" rdf="off" url_title="{segment_2}"}<br />
{!---- IF url_title DOES NOT MATCH ANY EXISTING ENTRIES then 404 it ----}<br />
{if no_results}{redirect="404"}{/if}<br />
&lt;title&gt;{title} - Your Blog&lt;/title&gt;<br />
{!--- the rest of the code to display the rest of this blog post ----}<br />
...<br />
...<br />
{/exp:weblog:entries}<br />
{/if}</code></p>
<p>This way, when a request for a bogus url like yoursite.com/blog/bogus-post-url/ is handled, it redirects to your 404 error template instead of returning a 200 status code and displaying an empty template or your blog/index template. Since we are not using the segment_3 variable in this template, we took care of that as well.</p>
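<p>If it helps to see the decision logic of that template on its own, here is a rough model of it in Python. It is not how EE works internally: <code>blog_status</code> and <code>split_segments</code> are made-up names, and the set of url_titles stands in for the lookup that the exp:weblog:entries tag does against the &#8220;blog&#8221; weblog.</p>

```python
def split_segments(path):
    """Split a URL path into EE-style segments: segment_1, segment_2, ..."""
    return [part for part in path.split("/") if part]


def blog_status(path, url_titles):
    """Model the status code the blog/index template above should produce.

    url_titles is a hypothetical set standing in for the entries in the
    "blog" weblog; paths are assumed to start with /blog/.
    """
    segments = split_segments(path)
    if len(segments) < 2:
        return 200  # segment_2 is empty: blog home page
    if len(segments) > 2:
        return 404  # segment_3 is not used by this template
    if segments[1] in url_titles:
        return 200  # segment_2 matches an entry's url_title
    return 404  # no_results: no entry with that url_title


entries = {"another-blog-post"}
for path in ["/blog/", "/blog/another-blog-post/", "/blog/bogus-post-url/",
             "/blog/another-blog-post/extra/"]:
    print(path, blog_status(path, entries))
```

<p>Only the blog home page and the real entry come back as 200; the bogus url_title and the extra third segment both end up as 404.</p>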
<h2>For Pages module pages</h2>
<p>Let&#8217;s say you have a setup like this:</p>
<p>The About section of your website is yoursite.com/about/ which is also what you named your template group and the template for this index page is about/index.</p>
<p>Instead of making this a &#8220;static&#8221; page with the pages module (which isn&#8217;t ideal for an index page) you made a template where you have the navigation/site map to all your /about/ pages via a weblog tag that pulls the entries from your pages weblog and sorts them however you need, by category or whatever.</p>
<p>This way when you add more pages to this section they automatically appear on the index page without having to modify a static index page, etc&#8230;</p>
<p>When the page module page is called it is displayed through another template: pages/index</p>
<p>Then in the pages module you set your page urls to be virtual subdirectories off of the /about index page you set up above. So your urls are something like this:</p>
<p>yoursite.com/about/ &#8212; the index page for this section</p>
<p>yoursite.com/about/me/interests/ &#8212; a page displayed via pages/index</p>
<p>yoursite.com/about/me/hobbies/ &#8212; a page displayed via pages/index</p>
<p>yoursite.com/about/company/services/ &#8212; a page displayed via pages/index</p>
<p>and so forth&#8230;</p>
<p>There are two things going on here when you get a bogus url request like: yoursite.com/about/foo/bar/</p>
<p>First, EE is going to look for a page to display and it doesn&#8217;t find one because you don&#8217;t have a page with the url about/foo/bar.</p>
<p>Next, it&#8217;s going to look for a template in the url, which it does find (the about/index template), and appends the /foo/bar segments on to it.</p>
<p>Without any 404 error checking, that page and any other bogus urls pointed at your /about section are going to display your about/index template and return a 200 status code.</p>
<p>Lock this down by using <a href="http://expressionengine.com/docs/cp/templates/global_template_preferences.html">Strict URLs</a> and put the following code at the top of your about/index template:</p>
<p><code>{if segment_2 != ""}{redirect="404"}{/if}</code></p>
<p>What this does is require a valid template for segment_2 of the url, or in the above example the /foo/ part of about/foo/bar. So if there is no about/foo template in your about template group, it will show a 404 error.</p>
<p><em>You should use this on every template where there should/will be no segment variables appended on to the url past the segment that calls the template (or the last segment used in that template).</em></p>
<p>Let&#8217;s say you have a template named privacy in your about template group that is accessed by yoursite.com/about/privacy. We already took care of the about/index template above, but you should use the strict url segment conditional above on your about/privacy template which would check for a bogus segment_3:</p>
<p><code>{if segment_3 != ""}{redirect="404"}{/if}</code></p>
<p>That way it will give a proper 404 status code if someone requests yoursite.com/about/privacy/does-not-exist/</p>
<p>On your pages/index template (that we set up above to handle the display of the entries in the pages weblog) use the require_entry parameter and the no_results conditional above. It should look something like this:</p>
<p><code>{exp:weblog:entries weblog="pages" limit="1" require_entry="yes" rdf="off"}<br />
{if no_results}{redirect="404"}{/if}<br />
&lt;head&gt;<br />
&lt;title&gt;{title} - Your Site&lt;/title&gt;<br />
{!--- the rest of the code to display the rest of this page ----}<br />
...<br />
...<br />
{/exp:weblog:entries}</code></p>
<p>That way any direct requests to your pages/index template will give a 404.</p>
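<p>Once the conditionals are in place, it is worth checking that bogus urls really do come back with a 404 status code and not a 200. Here is one way to sketch that check in Python using only the standard library. The function names are made up for this example, yoursite.com is a placeholder, and the demonstration at the bottom uses a fake fetcher so it runs without touching a live site.</p>

```python
import urllib.error
import urllib.request


def fetch_status(url):
    """Return the HTTP status code for url, treating error responses as data."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


def find_soft_404s(base_url, bogus_paths, fetch=fetch_status):
    """Return the bogus paths that do not come back with a 404."""
    return [path for path in bogus_paths if fetch(base_url + path) != 404]


# Hypothetical bogus URLs modelled on the examples in this post.
BOGUS_PATHS = [
    "/blog/bogus-post-url/",
    "/about/foo/bar/",
    "/about/privacy/does-not-exist/",
]

# Offline demonstration: a fake site where one bogus url still returns 200.
fake_site = {"http://yoursite.com/about/foo/bar/": 200}
print(find_soft_404s("http://yoursite.com", BOGUS_PATHS,
                     fetch=lambda url: fake_site.get(url, 404)))
```

<p>To run it against a real site, leave out the fetch argument so the default fetch_status is used, and pass your own domain as the base url. Anything it lists is a template that still needs a segment check.</p>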
<p>There are probably other methods for controlling access to bogus urls and serving proper 404 status codes in EE, but these are a few examples of what I have been using over the last several years.</p>
<p>Again this really isn&#8217;t a security problem, but it could end up being a pain in the ass.</p>
<h2>Keeping track of segments</h2>
<p>One helpful tip when building an EE site is to use the following code in your footer_template to keep track of your url segments. I assume that you are building your EE site with includes for the head section and other includable parts of your layout like the navigation, sidebars, footer, etc&#8230; Structuring your templates this way makes it much easier to maintain your site going forward.</p>
<p>Assuming that your admin login name is &#8220;superuser&#8221;, put the following in one of your global template includes like the footer:</p>
<p><code>{if username == "superuser"}<br />
URL segments:&lt;br /&gt;<br />
1 - {segment_1}&lt;br /&gt;<br />
2 - {segment_2}&lt;br /&gt;<br />
3 - {segment_3}&lt;br /&gt;<br />
4 - {segment_4}&lt;br /&gt;<br />
5 - {segment_5}&lt;br /&gt;<br />
6 - {segment_6}<br />
{/if}</code></p>
<p>This way when you are in development mode you will be able to see the labeled url segments for every section and page of your site. This makes it much easier, especially if you are using segment conditionals to control the display of page content. I got this tip a few years ago from the EE forums and it has been a big help.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.capelinks.com/blog/404-error-control-expression-engine/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

