From mike at fuckingbrit.com Sat Feb 23 08:12:58 2008
From: mike at fuckingbrit.com (Michael Jervis)
Date: Sat, 23 Feb 2008 13:12:58 +0000
Subject: [geeklog-spam] SWOT 0.2
Message-ID: <7b42e7470802230512x4cc67a1bi867c3d1b3d8ac5a9@mail.gmail.com>

http://swot.fuckingbrit.com/

Updated.

Any further comment? Does anyone think this is a good idea? Anyone think it's a bad idea?

Personally, I think it's a good idea with a lot of potential, but I may be being naive and/or have missed a few critical things that render this pointless. Prior to trying to make it into a fully fledged SoC project and wasting a load of some student's time on it, it'd be good to get some initial feedback! ;-)

Cheers,

Mike
--
Michael Jervis
mjervis at gmail.com
504B03041400000008008F846431E3543A820800000006000000060000007765
62676F642B4F4D4ACF4F0100504B010214001400000008008F846431E3543A82
0800000006000000060000000000000000002000000000000000776562676F64
504B05060000000001000100340000002C0000000000

From dirk at haun-online.de Sat Feb 23 09:11:54 2008
From: dirk at haun-online.de (Dirk Haun)
Date: Sat, 23 Feb 2008 15:11:54 +0100
Subject: [geeklog-spam] SWOT 0.2
In-Reply-To: <7b42e7470802230512x4cc67a1bi867c3d1b3d8ac5a9@mail.gmail.com>
References: <7b42e7470802230512x4cc67a1bi867c3d1b3d8ac5a9@mail.gmail.com>
Message-ID: <20080223141154.998435327@smtp.haun-online.de>

Michael Jervis wrote:

>http://swot.fuckingbrit.com/

Some thoughts:

I assume you would keep track of where an entry came from? Let's say I notice false positives and they're coming from a certain hop range. I'd like to re-adjust my hop count and get rid of those entries automatically.

Also, is there a requirement to share? On one of my sites, I'm filtering fairly aggressively, so not only have I blocked most of eastern Europe, but also some US ISPs. That's fine for the site in question (which is in German) but obviously I don't want to share those IPs.
Or maybe that's already addressed by having two separate blacklists: My SWOT list and the standard Spam-X IP blacklist.

It will be interesting to see how the trust thing works out.

bye, Dirk

--
http://www.geeklog.net/
http://geeklog.info/

From mjervis at gmail.com Sat Feb 23 14:12:21 2008
From: mjervis at gmail.com (Michael Jervis)
Date: Sat, 23 Feb 2008 19:12:21 +0000
Subject: [geeklog-spam] SWOT 0.2
In-Reply-To: <20080223141154.998435327@smtp.haun-online.de>
References: <7b42e7470802230512x4cc67a1bi867c3d1b3d8ac5a9@mail.gmail.com>
 <20080223141154.998435327@smtp.haun-online.de>
Message-ID: <7b42e7470802231112m3edc40c1k5d992e7843adf644@mail.gmail.com>

> I assume you would keep track of where an entry came from? Let's say I
> notice false positives and they're coming from a certain hop range. I'd
> like to re-adjust my hop count and get rid of those entries automatically.

http://swot.fuckingbrit.com/#rules

Point 5 on the importing feeds:

It should be possible to mark a particular source of regular expressions as untrusted, no matter who proxies that source to your system, they should be ignored by your system.

So if you subscribe to my SWOT feed, and start getting stuff from someone I trust, but you now don't trust, you can blacklist that item from all sources.

I hadn't thought of the implications of changing the hop range for a particular source, but it would make sense if the reference implementation auto-removed entries that come from further down on the hop chain from that feed. That would require you to record feed source and final source against each item imported.

> Or maybe that's already addressed by having two separate blacklists: My
> SWOT list and the standard Spam-X IP blacklist.

Yeah, personally, I'd probably use Spam-X for my personal enemy items, things that I want to block but not everyone would, and only use the SWOT blacklist for "generics".
But also, I was only thinking (not by any conscious choice, just through laziness) that I was sharing content filters, not where URLs resolve to or who the posting host is.

Should other types of filter be shared on SWOT?

Cheers,

Mike

From dirk at haun-online.de Sat Feb 23 15:47:13 2008
From: dirk at haun-online.de (Dirk Haun)
Date: Sat, 23 Feb 2008 21:47:13 +0100
Subject: [geeklog-spam] SWOT 0.2
In-Reply-To: <7b42e7470802231112m3edc40c1k5d992e7843adf644@mail.gmail.com>
References: <7b42e7470802230512x4cc67a1bi867c3d1b3d8ac5a9@mail.gmail.com>
 <20080223141154.998435327@smtp.haun-online.de>
 <7b42e7470802231112m3edc40c1k5d992e7843adf644@mail.gmail.com>
Message-ID: <20080223204713.1295182137@smtp.haun-online.de>

Michael Jervis wrote:

>But also, I was only thinking (not by any conscious choice, just
>through laziness) that I was sharing content filters, not where URLs
>resolve to or who the posting host is.
>
>Should other types of filter be shared on SWOT?

Sorry, for some reason I thought that SWOT would be dealing with IP addresses only. Don't ask me how I came to that conclusion from the "Blacklist regular expressions" bit ...

Anyway, I think it would be useful to share IP addresses (including address ranges). You would need a "type" attribute then, though, to tell them apart from regular expressions and so that the filter knows what to do with them.

Over time, I made quite a few "block this IP address (range)"-sort of posts[1] on my blog. And if I had a SWOT feed, I would then want to add them to it.

bye, Dirk

[1] e.g.

--
http://spam.tinyweb.net/

From ironmax at spacequad.com Sun Feb 24 06:38:45 2008
From: ironmax at spacequad.com (Michael Brusletten)
Date: Sun, 24 Feb 2008 06:38:45 -0500
Subject: [geeklog-spam] Spam status
References:
Message-ID: <000e01c876d9$d08afe90$fe00a8c0@ns2.spacequad.com>

To all,

I'm really curious how the last month has gone for web spam. I would really like to know if I've been making a real dent in the reduction or not.
I have noticed on my system that the Google spam has disappeared completely now.

What are your thoughts on all this?

Michael

From mjervis at gmail.com Wed Feb 27 05:33:08 2008
From: mjervis at gmail.com (Michael Jervis)
Date: Wed, 27 Feb 2008 10:33:08 +0000
Subject: [geeklog-spam] SWOT 0.2
In-Reply-To: <20080223204713.1295182137@smtp.haun-online.de>
References: <7b42e7470802230512x4cc67a1bi867c3d1b3d8ac5a9@mail.gmail.com>
 <20080223141154.998435327@smtp.haun-online.de>
 <7b42e7470802231112m3edc40c1k5d992e7843adf644@mail.gmail.com>
 <20080223204713.1295182137@smtp.haun-online.de>
Message-ID: <7b42e7470802270233m5d264f10t2a8cd44014f4457c@mail.gmail.com>

> Sorry, for some reason I thought that SWOT would be dealing with IP
> addresses only. Don't ask me how I came to that conclusion from the
> "Blacklist regular expressions" bit ...

Probably via the exact converse of the way I assumed it wasn't.

> address ranges). You would need a "type" attribute then, though, to tell
> them apart from regular expressions and so that the filter knows what to
> do with them.

I'll update the RFC to include types. Other than:

1) Post content Regexp
2) IP Address
3) IP Address Range
4) HTTP Header content regexp

Is there any other kind of item we should have?

Cheers,

Mike

From dirk at haun-online.de Wed Feb 27 14:36:11 2008
From: dirk at haun-online.de (Dirk Haun)
Date: Wed, 27 Feb 2008 20:36:11 +0100
Subject: [geeklog-spam] SWOT 0.2
In-Reply-To: <7b42e7470802270233m5d264f10t2a8cd44014f4457c@mail.gmail.com>
References: <7b42e7470802230512x4cc67a1bi867c3d1b3d8ac5a9@mail.gmail.com>
 <20080223141154.998435327@smtp.haun-online.de>
 <7b42e7470802231112m3edc40c1k5d992e7843adf644@mail.gmail.com>
 <20080223204713.1295182137@smtp.haun-online.de>
 <7b42e7470802270233m5d264f10t2a8cd44014f4457c@mail.gmail.com>
Message-ID: <20080227193611.689972099@smtp.haun-online.de>

Michael Jervis wrote:

>I'll update the RFC to include types.
Other than:
>
>1) Post content Regexp
>2) IP Address
>3) IP Address Range
>4) HTTP Header content regexp
>
>Is there any other kind of item we should have?

No, I think that covers the things you'd want to share.

bye, Dirk

--
http://www.haun-online.de/
http://spam.tinyweb.net/
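
The thread converges on two ideas: each shared item carries a "type" so the filter knows what to do with it, and each imported item records both its feed source and its final source so that entries can be dropped later. A minimal Python sketch of how that might fit together follows. It is not the SWOT reference implementation; every name in it (EntryType, SwotEntry, matches, purge_untrusted, purge_beyond_hops), the hops field, and the use of CIDR notation for address ranges are assumptions made for illustration.

```python
import ipaddress
import re
from dataclasses import dataclass
from enum import Enum


class EntryType(Enum):
    # The four item types listed in the thread (string values are made up).
    CONTENT_REGEXP = "content-regexp"
    IP_ADDRESS = "ip"
    IP_RANGE = "ip-range"
    HEADER_REGEXP = "header-regexp"


@dataclass(frozen=True)
class SwotEntry:
    type: EntryType
    value: str         # a regexp, a single IP, or a range (CIDR assumed)
    feed_source: str   # the feed this entry was imported from
    final_source: str  # the site that originally published the entry
    hops: int          # assumed: how far final_source is from us


def matches(entry, post_text="", remote_ip="", headers=None):
    """Return True if the entry flags this post."""
    if entry.type is EntryType.CONTENT_REGEXP:
        return re.search(entry.value, post_text) is not None
    if entry.type is EntryType.HEADER_REGEXP:
        lines = (f"{k}: {v}" for k, v in (headers or {}).items())
        return any(re.search(entry.value, line) for line in lines)
    if not remote_ip:
        return False
    addr = ipaddress.ip_address(remote_ip)
    if entry.type is EntryType.IP_ADDRESS:
        return addr == ipaddress.ip_address(entry.value)
    # Remaining case: IP_RANGE
    return addr in ipaddress.ip_network(entry.value, strict=False)


def purge_untrusted(entries, untrusted_source):
    """Drop entries that originate from a source marked untrusted,
    no matter which feed proxied them."""
    return [e for e in entries if e.final_source != untrusted_source]


def purge_beyond_hops(entries, feed_source, max_hops):
    """After lowering the hop count for one feed, drop that feed's
    entries that came from further down the hop chain."""
    return [
        e for e in entries
        if not (e.feed_source == feed_source and e.hops > max_hops)
    ]
```

Keeping final_source separate from feed_source is what makes both clean-ups possible: marking a source untrusted removes its entries regardless of who relayed them, while tightening the hop count only touches entries that arrived through the one feed being adjusted. Site-specific blocks that should not be shared would simply never be put into this list, in line with the "two separate blacklists" suggestion.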