The Fubra Blog

Furnish.co.uk – how we automatically categorise content

Posted Tuesday 2nd March 2010 by Simon

From the outside, furnish.co.uk looks pretty simple: thousands of fabulous interiors products from many different stores, all easily searchable and nicely categorised.

But in the background there’s some seriously clever stuff going on. We automatically scrape stores’ websites (with their permission, of course) or scan their feeds at regular intervals, and then subject this data to heavy processing to determine category, style, colour, materials and so on. We also do some leading-edge work to make our custom-developed search engine produce super-accurate results.

For me, the most important thing about furnish.co.uk is that when a user types “wooden coffee tables” into the search or navigates to coffee tables or table lamps, that’s precisely what they get. They don’t see a coffee jug under coffee tables. I know of no sites that do this particularly well.

So, I thought I’d write about how we achieve some of this stuff – to start with, how we automatically and accurately categorise the thousands of products into our category hierarchy; how we know a bed is a bed, a wall light is a wall light, etc. I’m not the main technical guy at furnish.co.uk – that’s Alan – so I’ll keep things at a relatively non-technical level. But, hopefully, you’ll find it pretty interesting.

Categories and clones
We’ve constructed a hierarchy of hundreds of categories, covering every interiors product for the home. We did this by analysing other sites that sell home interiors products (e.g. John Lewis, Graham and Green, Marks and Spencer, Heals) and then putting together something that we thought worked.

We decided to give users multiple ways to navigate through the category hierarchy to the same product, for ease of use. However, for simplicity, each product only actually has a single category. This works because we have ‘clone categories’, where a category can exist in multiple places. For example, Rooms -> Bedrooms -> Bedroom furniture -> Bedside tables shows the same items as Products -> Furniture -> Bedroom furniture -> Bedside tables. One is a clone of the other, which means the item itself only needs a single category.
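To make the clone idea concrete, here is a minimal sketch in Python. It is not our actual implementation: the `Category` class, the `clone_of` field and the `products_in` helper are names made up for illustration. The point is only that a clone resolves to one real category, so each product stores a single category however many navigation paths lead to it.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Category:
    """A node in the category hierarchy (hypothetical structure)."""
    path: str
    clone_of: Optional["Category"] = None  # set when this node mirrors another

    def canonical(self) -> "Category":
        # Follow clone links until we reach the real category.
        node = self
        while node.clone_of is not None:
            node = node.clone_of
        return node


# The real category, and a clone of it reachable via a different path.
bedside = Category("Products -> Furniture -> Bedroom furniture -> Bedside tables")
bedside_clone = Category(
    "Rooms -> Bedrooms -> Bedroom furniture -> Bedside tables",
    clone_of=bedside,
)

# Each product is assigned exactly one category (illustrative data).
products: Dict[str, Category] = {"Oak bedside table": bedside}


def products_in(category: Category, products: Dict[str, Category]) -> List[str]:
    """List products for a category, resolving clones to the real category."""
    target = category.canonical()
    return sorted(name for name, cat in products.items() if cat is target)
```

Browsing via either path, `products_in(bedside, products)` or `products_in(bedside_clone, products)`, returns the same list.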

Assigning categories
Based on the above, we automatically assign categories to items. For each item imported, we do the following:

  1. We pull several fields out of the item that we think will give us a hint to its category and we prioritise these fields. For example, we may think that the item name is most likely to give us a hint, followed by the description. The choice of fields and their priority varies by supplier; some suppliers actually have a categories field that we can use.
  2. Next, we take each of these fields in turn (highest priority first) and attempt to determine the category. We do this using a HUGE synonyms library that we’ve painstakingly put together from scratch, where each category has a set of associated synonyms. You can see a screen shot from our back end system below. Each synonym is also prioritised, so the system finds all synonyms contained within the field that it’s analysing and then chooses the one with the highest priority. If the field contains multiple synonyms with equal priority, it uses the one that comes first in the field. There are also negative synonyms, i.e. words that rule out a certain category for any item that contains them.
  3. If there are no matches, it moves onto the next field within the item to see if that contains a synonym.
  4. This is repeated for all items being imported. On rare occasions, no synonyms are found and items end up with no category. However, these are not published.
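The steps above can be sketched roughly as follows in Python. This is an illustration only, not our production code. The synonym entries, priorities, negative words and the `categorise` function are all invented for the example. Matching here is a simple search from the start of a word, and negative synonyms are checked against the field being analysed, which is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Synonym:
    phrase: str
    category: str
    priority: int  # higher number wins


# Tiny made-up synonyms library; the real one is far larger.
SYNONYMS: List[Synonym] = [
    Synonym("coffee table", "Coffee tables", 10),
    Synonym("table lamp", "Table lamps", 10),
    Synonym("table", "Tables", 5),
    Synonym("jug", "Jugs", 5),
    Synonym("book", "Books", 5),
]

# Negative synonyms: a category is ruled out if any of these words appear.
NEGATIVE: Dict[str, List[str]] = {
    "Coffee tables": ["book"],
    "Tables": ["book"],
}


def find(phrase: str, text: str) -> Optional[int]:
    """Return the position of phrase in text (matching at a word start), or None."""
    match = re.search(r"\b" + re.escape(phrase), text)
    return match.start() if match else None


def categorise(item: Dict[str, str], field_order: List[str]) -> Optional[str]:
    """Try each field in priority order; return a category or None."""
    for field in field_order:
        text = (item.get(field) or "").lower()
        candidates = []
        for syn in SYNONYMS:
            pos = find(syn.phrase, text)
            if pos is None:
                continue
            negatives = NEGATIVE.get(syn.category, [])
            if any(find(neg, text) is not None for neg in negatives):
                continue  # ruled out by a negative synonym
            # Sort key: highest priority first, then earliest position.
            candidates.append((-syn.priority, pos, syn.category))
        if candidates:
            return min(candidates)[2]
        # No match in this field, so fall through to the next one.
    return None  # uncategorised items would not be published
```

For example, with `field_order = ["name", "description"]`, an item named "Oak Coffee Table" comes out as "Coffee tables", because "coffee table" outranks the more general "table". An item whose name contains no synonym at all falls back to its description.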

We’ve found that our processing and synonyms library are sufficiently good that any new supplier coming on board tends to get 95% correct categorisation straight away. We then manually tweak the synonyms to ensure 100% accuracy for the supplier.

All pretty elaborate, but the result is clear and accurate navigation. We also use some of this information for the search index, but more on that in a future post.

Calling all start-ups

Posted Thursday 4th February 2010 by Brendan

Fubra is looking to invest in more successful web start-ups than any other UK investment business in 2010. So if you are running a business online and think you could grow it bigger with our help or investment then you should think about getting in touch.

Before you approach us though, I thought it would be good to give some hints of what we are looking for and what we can offer to you in return for shares in your business. In addition to reading about our company and investments on our site, to see what we are doing, these hints should give you more of an idea what to expect in a meeting with us.

Did you know? In 2009 Fubra invested £250,000 into private companies. In 2010 we want to invest double that if we can find enough good prospects.

But why are we special? Read the rest of this entry »

Accounting Software for 2010

Posted Monday 11th January 2010 by Paul

As you might have seen from some of our blog posts last week, we had a lot of fun with all the fresh snow that arrived. But how did your business cope with the disruption to the country’s transport network? Luckily for us, as a web based company, if our staff can’t make it in to the office, they have most of the tools they need to work from home.

One tool in particular is an accounting software service that we’ve been working on for the past 18 months – it’s called Clear Books.

Traditional Accounting Software

Traditional accounting software typically provides a fairly archaic solution. Here’s the scene: You have an old Windows machine in the corner of your office that has the accounting software installed on it. Your accountant or book keeper comes in every now and then to keep the system up to date. If you want some management accounts to see how your business is doing, you or someone working for you has to log on to that machine and export them. If it snows, your book keeper can’t get to the office and your accounts start gathering dust. If you are at home one evening and you want to know how much you owe the VAT man, you have to wait until the next day to visit the office. You’ve paid a large amount up front for the privilege of installing the software, and then each year you have to pay to keep it up to date.

Traditional accounting software….

  • Comes in a box
  • Needs to be installed on a computer
  • Can only be accessed from one computer, by one person at a time
  • Has a hefty up front cost

No Accounts Software

Some people don’t like using these old accounting software products, so they opt for an even more labour intensive approach: they get their accountant to physically collect all their invoices and bank statements (also known as the black bin liner method), so that he can prepare their payroll, VAT and end of year accounts, and then physically deliver them back when he’s finished with them. This method typically gives you no ongoing picture of your business’s performance, and you have to rely on bank statements, spreadsheets or just your gut feeling to see you from year to year. Also, if you need to refer to an old document, you normally can’t, because it is with your accountant.

Online Accounting

Then there is the new approach. We call this online accounting. In this scenario, the accounting software is accessed via a web browser. You don’t need to install anything on your PC. You can view your accounts from any computer (PC, Mac, Linux) with an Internet connection. So can your accountant or book keeper, so he/she no longer has to drive to your office. You and your business partners can access the software at the same time, from anywhere in the world. There are no major up front costs; you just pay a small monthly subscription, and regular software updates are included as part of this. The software automatically generates your VAT returns each quarter, and lets you file them with HMRC with a few clicks of the mouse.

Online accounting software….

  • Can be accessed from any computer with an Internet connection and a web browser.
  • Requires no software installation, updates or upgrades.
  • Has a low monthly cost, with no up front fees.

If this sounds like something you want to know more about, you might want to read my 10 reasons to use Clear Books post.

Clear Books gets a new front cover

Posted Wednesday 14th October 2009 by Steve

A few weeks ago work started on updating and restructuring the design of Clear Books, an online accounting software product to free your time.

The beta version was launched back in December last year (2008) and since then the software has shed its beta label (July ‘09) and evolved rapidly, boasting a number of great features and reasons to use Clear Books.

Clear Books Revamp

Read the rest of this entry »

Fubra investment grows traffic to 100,000 users per day

Posted Tuesday 13th October 2009 by Pete

Renegade Games Limited, the investment Fubra made in February 2009, has just acquired four more websites, all in the gaming industry. All four websites are “top lists”: websites that rank games according to votes that their players send in, thereby creating a ranking page showing which games are popular.

The websites were bought after tough negotiations with the old owner, but a deal both sides were happy with was eventually reached.

The four new websites are:

Top 100 Arena

Gaming Sites 100

Game Sites 101

MMORPG 101

These websites will complement the games already owned by Renegade Games, including the game that is newly being developed from scratch, codenamed SimRepublic.

The websites give a great platform for games to advertise, as the websites now owned by Renegade Games top 100,000 users per day. This is good progress after about 8 months of building websites.
