The Fubra Blog
Furnish.co.uk – how we automatically categorise content
From the outside, furnish.co.uk looks pretty simple: thousands of fabulous interiors products from many different stores, all easily searchable and nicely categorised.
But in the background, there’s some seriously clever stuff going on. We automatically scrape stores’ websites (with their permission, of course) or scan their feeds at regular intervals, then put this data through heavy processing to determine category, style, colour, materials and so on. We also do some leading-edge work to make our custom-developed search engine produce super-accurate results.
For me, the most important thing about furnish.co.uk is that when a user types “wooden coffee tables” into the search or navigates to coffee tables or table lamps, that’s precisely what they get. They don’t see a coffee jug under coffee tables. I know of no other sites that do this particularly well.
So, I thought I’d write about how we achieve some of this stuff – to start with, how we automatically and accurately categorise the thousands of products into our category hierarchy; how we know a bed is a bed, a wall light is a wall light, etc. I’m not the main technical guy at furnish.co.uk – that’s Alan – so I’ll keep things at a relatively non-technical level. But, hopefully, you’ll find it pretty interesting.
Categories and clones
We’ve constructed a hierarchy of hundreds of categories, covering every interiors product for the home. We did this by analysing other sites that sell home interiors products (e.g. John Lewis, Graham and Green, Marks and Spencer, Heals) and then putting together something that we thought worked.
We decided to let users reach the same product by multiple routes through the category hierarchy, for ease of use. However, for simplicity, each product actually has only a single category. This works because we have ‘clone categories’, where a category can exist in multiple places. For example, Rooms -> Bedrooms -> Bedroom furniture -> Bedside tables shows the same items as Products -> Furniture -> Bedroom furniture -> Bedside tables. One is a clone of the other, which means the item itself only needs a single category.
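To make the clone idea a bit more concrete, here is a minimal sketch in Python. It is not our actual code: the `Category` class, the `clone_of` field and the `items_in` helper are all names made up for illustration, and the sample product is invented. The point is simply that a clone category points at the original, so a product only ever needs to be assigned to the original.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(eq=False)  # compare categories by identity, not by field values
class Category:
    name: str
    parent: Optional["Category"] = None
    clone_of: Optional["Category"] = None  # hypothetical: set only on clones

    def canonical(self) -> "Category":
        """Follow clone links to the category that items are assigned to."""
        node = self
        while node.clone_of is not None:
            node = node.clone_of
        return node

    def path(self) -> str:
        """Return the navigation path from the top of the hierarchy."""
        parts = []
        node: Optional["Category"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " -> ".join(reversed(parts))


def items_in(category: Category, assignments: Dict[str, Category]) -> List[str]:
    """List the items shown when a user browses to the given category."""
    target = category.canonical()
    return [item for item, assigned in assignments.items() if assigned is target]


# The original branch: Products -> Furniture -> Bedroom furniture -> Bedside tables
products = Category("Products")
furniture = Category("Furniture", parent=products)
bedroom_furniture = Category("Bedroom furniture", parent=furniture)
bedside_tables = Category("Bedside tables", parent=bedroom_furniture)

# The clone branch: Rooms -> Bedrooms -> Bedroom furniture -> Bedside tables
rooms = Category("Rooms")
bedrooms = Category("Bedrooms", parent=rooms)
rooms_bedroom_furniture = Category("Bedroom furniture", parent=bedrooms)
bedside_tables_clone = Category(
    "Bedside tables", parent=rooms_bedroom_furniture, clone_of=bedside_tables
)

# Each item has exactly one category (invented sample data).
assignments = {"Oak bedside table": bedside_tables}

print(bedside_tables_clone.path())
print(items_in(bedside_tables_clone, assignments))
```

Both navigation routes end up listing the same items, even though the item itself is only stored against one category.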
Assigning categories
Based on the above, we automatically assign categories to items. For each item imported, we do the following:
- We pull several fields out of the item that we think will hint at its category, and we prioritise these fields. For example, we may think that the item name is most likely to give us a hint, followed by the description. The choice of fields and their priority varies by supplier; some suppliers actually have a categories field that we can use.
- Next, we take each of these fields in turn (highest priority first) and attempt to determine the category. We do this using a HUGE synonyms library that we’ve painstakingly put together from scratch, where each category has a set of associated synonyms. You can see a screenshot from our back-end system below. Each synonym is also prioritised, so the system finds all synonyms contained within the field that it’s analysing and then chooses the one with the highest priority. If the field contains multiple synonyms with equal priority, it uses the one that comes first in the field. There are also negative synonyms: an item cannot be assigned to a certain category if it contains certain words.
- If there are no matches, it moves on to the next field within the item to see if that contains a synonym.
- This is repeated for all items being imported. On rare occasions, no synonyms are found and an item ends up with no category. Such items are not published.
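For the more technically minded, the steps above can be sketched roughly as follows in Python. This is an illustration only, not our production code. The synonyms, priorities, negative words and field names are invented sample data, and the function names are hypothetical. I’ve also made two simplifying assumptions: synonyms are matched as plain lower-case substrings, and negative synonyms are checked against the field currently being analysed.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Synonym:
    phrase: str    # text to look for in a field
    category: str  # category the phrase points to
    priority: int  # higher number wins


# Invented sample library; the real one is far larger.
SYNONYMS = [
    Synonym("coffee table", "Coffee tables", 10),
    Synonym("table lamp", "Table lamps", 10),
    Synonym("jug", "Jugs", 10),
    Synonym("table", "Tables", 5),
    Synonym("lamp", "Lighting", 5),
]

# Negative synonyms: words that rule a category out (invented examples).
NEGATIVE_SYNONYMS: Dict[str, List[str]] = {
    "Coffee tables": ["jug", "mug"],
}


def categorise_field(text: str) -> Optional[str]:
    """Pick the best category for one field, or None if nothing matches."""
    lowered = text.lower()
    candidates = []
    for synonym in SYNONYMS:
        position = lowered.find(synonym.phrase)
        if position == -1:
            continue  # synonym not present in this field
        blocked_words = NEGATIVE_SYNONYMS.get(synonym.category, [])
        if any(word in lowered for word in blocked_words):
            continue  # a negative synonym rules this category out
        candidates.append((synonym.priority, position, synonym.category))
    if not candidates:
        return None
    # Highest priority first; on a tie, the synonym that appears earliest.
    best = min(candidates, key=lambda c: (-c[0], c[1]))
    return best[2]


def categorise_item(item: Dict[str, str], field_priority: List[str]) -> Optional[str]:
    """Try each field in priority order until one yields a category."""
    for field_name in field_priority:
        text = item.get(field_name, "")
        if not text:
            continue
        category = categorise_field(text)
        if category is not None:
            return category
    return None  # uncategorised items would not be published


# The field order would vary by supplier; this one is made up.
fields = ["name", "description"]

print(categorise_item({"name": "Wooden coffee table"}, fields))
print(categorise_item({"name": "Glass jug for the coffee table"}, fields))
print(categorise_item({"name": "The Aurora", "description": "A brass table lamp"}, fields))
print(categorise_item({"name": "Mystery item"}, fields))
```

In this toy data, “coffee table” outranks the more general “table”, the word “jug” stops a jug being filed under coffee tables, and an unhelpful product name falls through to the description.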
We’ve found that our processing and synonyms library are good enough that a new supplier coming on board tends to get 95% of its items categorised correctly. We then manually tweak the synonyms to ensure 100% accuracy for that supplier.
All pretty elaborate, but the result is clear and accurate navigation. We also use some of this information for the search index, but more on that in a future post.
Tags: Decoration, Furniture, Home, House, Interior

Pretty impressive technology you guys have built here, Simon. Appreciate you sharing the behind-the-scenes peek!