Ruby on Rails
Tags:
Programming
Ruby on Rails
I was struggling with SOAP for the first time today. No, today is not the first day I bathe! SOAP is some crazy XML communication protocol that seems like overkill to me.
Anyway, I was using Ruby to send and receive the request, and I was getting a baffling SOAP::Mapping::Object object, and I had no idea how to use it. It's a class with dynamically generated methods, and none of the ones I was expecting to see were there.
Blogosphere to the rescue! Marty Haught's Weblog gave a simple trick that saved the day and helped me narrow down my problem to let me see the solution within minutes.
To inspect the resulting object, I was looking at the output of this:
@results.public_methods.sort.join(' ')
Which gives a huge list of methods.
Thanks to Marty, I did this instead:
(@results.public_methods - Object.public_methods).sort.join(' ')
Subtracting Object.public_methods removes all the methods that all Ruby objects get, showing me only the methods that are different. I got this list:
[] []= __add_xmlele_value __xmlattr __xmlele out out=
Sure enough, the "out" method stood out, so I investigated it. A few minutes later, I had navigated through the class and found what I needed. Sweet!
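Here's a minimal sketch of the same trick in plain Ruby, in case you want to try it without a SOAP service handy. FakeResponse is a made-up stand-in for the SOAP object (it is not how the SOAP library builds its objects); it just defines a couple of methods at runtime so there is something to discover.

```ruby
# FakeResponse is hypothetical: a stand-in for an object whose methods
# are generated at runtime, so you can't read them off a class definition.
class FakeResponse
  def initialize(fields)
    fields.each do |name, value|
      # Define a reader and a writer for each field on this one object.
      define_singleton_method(name) { value }
      define_singleton_method("#{name}=") { |new_value| value = new_value }
    end
  end
end

results = FakeResponse.new(:out => "payload")

# The full list is long and mostly noise:
all_methods = results.public_methods

# Subtract what every object has, and only the interesting ones remain:
interesting = (results.public_methods - Object.public_methods).sort

puts "#{all_methods.size} methods in total"
puts interesting.join(' ')
```

The second puts should print only the two generated methods, which is a lot easier to scan than the full list.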
I post this here so that others might discover this simple trick and solve whatever problem is keeping them up late at night.
Tags:
Programming
Ruby on Rails
If you’re using Ruby on Rails to build your website, then you’ve probably got a lot of dynamically generated pages. In this blog, I have one page for every article I write. I also have a page for each month, and a page for each year. I also have a page for each tag which lists all the articles with those tags. It isn’t too much, so Google's web crawler can keep up with it and get everything indexed.
But I have another site with hundreds of individual pages, and Google’s crawler stops indexing the site after only a few dozen pages, missing the important ones. If one of my pages isn't in Google's index, then it isn't working for me! It's just wasting hard drive space. It was obvious that I needed to tell Google what to crawl by giving it a sitemap. Read Google's docs about sitemaps, and get familiar with Google's Webmaster Tools so that you can submit sitemaps. Basically, a sitemap acts as a table of contents for Google, so it can understand which pages it should be looking at when it visits your site.
Because Rails is so awesome, there’s an easy way to generate sitemaps using Ruby. It should only take you a few minutes to write your sitemap code once you have a look at how it’s done. I’ll give a brief overview here.
In the same way that you can generate HTML files using .rhtml files, you can create XML files using .rxml. First, let’s look at Google’s sitemap format to see what kind of XML document we need to build. Here’s an example sitemap:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>http://www.geekskillz.com/tags/1</loc>
<priority>0.9</priority>
<changefreq>weekly</changefreq>
</url>
<url>
<loc>http://www.geekskillz.com/tags/2</loc>
<priority>0.8</priority>
<changefreq>weekly</changefreq>
</url>
...
</urlset>
Each page you want Google to crawl is contained in a url element, which contains a mandatory loc element and a bunch of optional elements. I use the priority element so that the more important pages get crawled if Google is too lazy to crawl my entire site, which is usually what happens. It’s like saying, “Please index this one first, and if you have time, here are the other ones you should index.”
Priority is a value between 0 and 1, and the values are only compared with each other, within your own site. So setting all your URLs to a priority of 1.0 is pointless. 1.0 doesn't mean that your web pages are super important. It would mean that all your pages are equally important in relation to each other, which doesn’t help the web crawler at all. It wants to know where it should start crawling, not how important you think your web page is compared to everyone else on the internet. We know your mom says you're special, but Google doesn't consider her to be credible on these matters.
The changefreq element is another hint to Google, suggesting how often it should return to find new content on your website. It’s something that might get ignored, but it’s probably worth using. Google’s sitemap documentation says that these hints may be ignored, so don’t spend too much time thinking about them.
Now to the Ruby code. To create a sitemap that keeps up to date on your site’s content, you can use a .rxml file. I named mine sitemap.rxml. The file is written in Ruby, and uses the xml object to create the XML elements in the output. Here’s a small example that shows how it basically works:
xml.instruct! :xml, :version=>"1.0"
xml.outside{
xml.inside("Hello World!")
}
This creates this output XML file:
<?xml version="1.0" encoding="UTF-8"?>
<outside>
<inside>Hello World!</inside>
</outside>
So, you create an element by calling the xml object with the name of the element you want to create ("outside" and "inside" in the above example). To put a value inside the element, you pass it in as an argument (like "Hello World!" in the example). Easy pickings!
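If you're curious how calling a method named after an element can possibly work, here's a toy sketch of the idea. TinyXml is a hypothetical class I made up for illustration. It is not the real xml object that Rails gives you in an .rxml file, and it skips everything important (attributes, escaping, the instruct! line). It only shows the trick of turning a method call into an element.

```ruby
# TinyXml is a hypothetical toy, not the real Builder library.
# It turns any method call into an element of the same name.
class TinyXml
  attr_reader :output

  def initialize
    @output = String.new
    @depth = 0
  end

  def method_missing(name, content = nil)
    indent = "  " * @depth
    if block_given?
      # A block means nested elements: open the tag, run the block, close the tag.
      @output << "#{indent}<#{name}>\n"
      @depth += 1
      yield
      @depth -= 1
      @output << "#{indent}</#{name}>\n"
    else
      # No block: the argument becomes the element's text.
      @output << "#{indent}<#{name}>#{content}</#{name}>\n"
    end
  end
end

xml = TinyXml.new
xml.outside {
  xml.inside("Hello World!")
}
puts xml.output
```

In a real .rxml file you don't need any of this, since Rails hands you a ready-made xml object. The sketch is only here to take the magic out of it.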
And really, that’s all you need to learn to start building your sitemap. So, here’s an example .rxml file that creates a sitemap:
xml.instruct! :xml, :version=>"1.0"
xml.urlset(:xmlns => "http://www.sitemaps.org/schemas/sitemap/0.9"){
  # High priority pages:
  @important_tags.each do |tag|
    xml.url{
      xml.loc("http://www.website.com" + tag_path(tag))
      xml.priority(0.9)
      xml.changefreq("weekly")
    }
  end
  # Low priority pages:
  # We want the priority value to range from 0.2 to 0.8
  @priority_multiplier = 0.6 / @low_pri_pages.size.to_f
  @low_pri_pages.each_with_index do |tag, i|
    xml.url{
      xml.loc("http://www.website.com" + tag_path(tag))
      priority = 0.8 - (i * @priority_multiplier)
      xml.priority( "%.2f" % priority )
    }
  end
}
This is doing something a bit fancy with the priority value. The @low_pri_pages array has been sorted (in the controller) in order of most important to least important (based on some criteria). Then the priority value is calculated based on the order of the array. So when this .rxml file is run, we’ll get something like this:
...
<url>
<loc>http://www.website.com/tags/10</loc>
<priority>0.72</priority>
</url>
<url>
<loc>http://www.website.com/tags/11</loc>
<priority>0.71</priority>
</url>
<url>
<loc>http://www.website.com/tags/12</loc>
<priority>0.70</priority>
</url>
...
If that part of the example is unclear, forget about it. My point is this: You can use any Ruby code here to generate the sitemap with as much dynamicness (is that a word?) as you like. I chose to dynamically generate the priority value, just because that's how I get my kicks.
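For anyone who does want to follow the priority arithmetic, here it is on its own in plain Ruby, outside of any .rxml file. The page names are placeholders I made up, and I picked six pages only because it makes the numbers round.

```ruby
# Hypothetical low-priority pages, already sorted most important first.
low_pri_pages = %w[page_a page_b page_c page_d page_e page_f]

# Spread the priorities over a range 0.6 wide, starting at 0.8.
priority_multiplier = 0.6 / low_pri_pages.size.to_f

priorities = low_pri_pages.each_with_index.map do |page, i|
  [page, "%.2f" % (0.8 - (i * priority_multiplier))]
end

priorities.each { |page, priority| puts "#{page}: #{priority}" }
```

With six pages, each step is 0.1, so the values run from 0.80 down to 0.30. The last page always lands one step above 0.2 rather than exactly on it, which is close enough for a hint that Google may ignore anyway.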
Note that I am using Rails 1.2, so I'm using resource helpers like tag_path to get the URLs. If you're not using these helpers, then you can generate your <loc> values using the url_for function like this:
xml.loc("http://www.website.com" +
  url_for(:controller => 'tags', :action => 'show', :id => tag))
Before you run off and create your own sitemap, you should think about where to put it. Which controller owns it? What does the routes.rb file look like?
Google recommends that sitemap.xml be placed at the root of your website. For example, http://www.website.com/sitemap.xml is how Google will look for it. So, your routes.rb file should set up a route to that location to tell your Rails app which controller to use to find the sitemap.rxml file. Here’s what I added to the top of my routes.rb file:
map.connect 'sitemap.xml', :controller => 'products', :action => 'sitemap'
So, I created an action in my products_controller that is responsible for creating the sitemap. Done.
Let me know if you have any tips on using sitemaps with Rails apps. I didn’t spend more than a day on it, so I’m sure there are plenty of useful tips.
Tags:
Programming
Ruby on Rails
I'm back from vacation, back in the city, back in the workforce, and back with my MacBook on my lap. Time to show this blog some loving again.
I'll warm up with a quick programming article. My regular readers will leave now and not notice how rusty my writing is. People stumbling in will have low expectations. Perfect!
 For some reason, I decided to install the long-named IBM OmniFind Yahoo! Edition search engine on my home web server that my web sites run on. I have a web site that could really benefit from a dedicated web crawler, since Google has been ignoring it and only indexing a few pages.
So I point it to the new website, and get this error: "The URL does not have any pages available to crawl". The site has almost 300 pages to crawl, so I scratched my head in confusion and dug deeper to find the root cause of the problem.
I found out that the OmniFind web crawler was requesting pages in XML format, rather than HTML. Since I had not set up the site to respond properly to XML requests, it was giving empty pages to the crawler. So of course, the crawler thought it was looking at an empty web site. From my testing, this will only happen with resources in Ruby on Rails 1.2. Older Rails apps received requests for HTML as I had expected. So, Rails 1.2 is interpreting the OmniFind crawler requests as being for XML, although I could not find anything in the request that indicated this. Could be a bug in Rails 1.2, or ambiguity in the requests.
This showed me how important it is for web crawlers to identify themselves to the web sites they are crawling. The OmniFind docs strongly recommend that users configure OmniFind to identify itself, which is done in the Crawl Web Sites > Manage Web Sites > Crawler Settings section of the admin web app.
So, to identify the crawler, I used my e-mail address and a string I made up: "mighty_omnifind". Then, in my Rails app, I added the following to the application.rb controller helper:
class ApplicationController < ActionController::Base
  before_filter :hack_for_omnifind

  def hack_for_omnifind
    if @request.env["HTTP_USER_AGENT"] == "mighty_omnifind"
      params[:format] = 'html'
    end
  end
end
So, if the HTTP request is coming from something that identifies itself as "mighty_omnifind", then I force the requested format to be HTML. Easy pickings.
I couldn't find any documentation about params[:format], but I'm getting used to the way Rails thinks, so I made a guess that such a param exists. Lucky for me, it works!
I recommend IBM OmniFind Yahoo! Edition as an extremely easy-to-use, easy-to-install, and free search engine. The install is 3 clicks, and once you add your web site(s), it should run itself. However, since I have more than one web site running on this one box, it would be nice to be able to have a separate index for each web site. But there are ways to work around this, and it's hard to complain when the price is $0.
Tags:
Programming
Ruby on Rails
I found an interesting presentation about Ruby aimed at Java programmers: 10 Things Every Java Programmer Should Know About Ruby. It says it isn't trying to convince you that Java sucks and Ruby will make you a much happier programmer, but you might start to feel that way by the time you get to the last point. I don't think Ruby can replace Java, but that doesn't stop me from wishing. Ruby is an object-oriented language, but differs from Java's implementation in some important ways.
Number 8 was new information to me, and I had to think about it for a while. It says that almost everything in Ruby is a message. When you call a class method, that call is a message. The method and its arguments are the message, and the class (or instance of the class) is the destination of the message. The class and the message are not joined to each other in any special way. The message could have just as easily been directed at another class object. I now think of Ruby classes this way: Classes are written to receive messages.
Is this only a distinction without a difference? I don't think so. I'll try to illustrate with an example.
A while ago, I was working on a text-based MUD (multi-user dungeon, the ancestor to the modern MMORPG). I was philosophizing about where actions belonged in an object-oriented language. To unlock a door with a key in the game, how would the code be written? door.unlock(key)? key.unlock(door)? unlock(door,key)? use(door,key,unlock)? I argued against the first two examples, for reasons I can't remember. Looking at them side-by-side makes it clear that making one choice over the other may limit your design for any future features. But in practice, either one will probably work fine, so just make a choice.
Now I'm thinking about it again in terms of Ruby's messages. What if a game action was implemented as a message that could be sent to any game object? Would that solve anything? Does it give a level of abstraction that opens up a game's design? Let's see.
I made an example where we've got a "Creature" class, which represents any living thing, including the players, and a "Door" class, which is just a door. Then I implement an explosion in the room that is performed on everything in the room. Since explosions hurt, I will then apologize to everything in the room.
I'm not going to create an "Explosion" class, or an "Apology" class. I'm not even going to make it more abstract by making "Attack" or "TextMessage" classes. I'll just make one class to encapsulate them all: ActionRecorder. The ActionRecorder is like a proxy. You record the "message" (ie, the explosion and apology) into the ActionRecorder, and then perform the action on any other class. This creates an instance of the action that can be reused, serialized, stored, logged, or whatever you want. A game action can have multiple effects, so the ActionRecorder will store each effect as a Ruby message.
Enough blah blah blah, it's time to see some code. Here's the ActionRecorder, the most interesting part of the code (this is taken from the presentation):
class ActionRecorder
  def initialize
    @messages = []
  end

  # Record all messages received, using this method:
  def method_missing(method, *args, &block)
    @messages << [method, args, block]
  end

  # Send the messages to an object
  def perform_on(obj)
    @messages.each do |method, args, block|
      obj.send(method, *args, &block)
    end
  end
end
Note that method_missing is a built-in method that all Ruby objects have. Ruby calls it automatically when an object receives a message for a method it has not defined. The default implementation is what raises the "NoMethodError" exception, so overriding method_missing lets you handle the message yourself instead. Also, all Ruby objects have the send method, which takes a method name and its arguments and invokes that method, just like a normal call would.
So, the ActionRecorder is taking advantage of method_missing to record all messages that an ActionRecorder instance receives. I'm not sure how you would do this in Java without getting compiler errors. It's probably possible, but won't be as elegant as this Ruby code.
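Here's a small sketch that isolates the two pieces the ActionRecorder relies on: send delivering a message by name, and method_missing deciding what happens when nothing matches. Quiet and Strict are hypothetical classes that exist only for this demonstration.

```ruby
# Both classes here are hypothetical, just to show the two behaviours.
class Quiet
  # Ruby calls this when it can't find a method matching the message.
  def method_missing(method, *args, &block)
    "ignored #{method}(#{args.join(', ')})"
  end
end

class Strict
  # No method_missing defined, so Ruby's default one is used.
end

# send delivers a message by name, just like a normal method call:
quiet_result = Quiet.new.send(:explode, 10)

# Without a custom method_missing, the same message raises NoMethodError:
strict_result = begin
  Strict.new.send(:explode, 10)
rescue NoMethodError => e
  "#{e.class} raised"
end

puts quiet_result
puts strict_result
```

The same message goes to both objects. Only the receiver decides whether it is handled, swallowed, or treated as an error.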
Now let's create some game objects:
class Creature
  # Creatures should have names:
  def initialize( name )
    @name = name
  end

  # Define some actions that can happen to Creatures:
  def physical_damage(num_points)
    puts "#{@name} is hit for #{num_points} points!"
  end

  def fire_damage(num_points)
    puts "#{@name} is burned for #{num_points} points!"
  end

  def love_message(msg)
    puts "#{@name} feels the love."
  end

  # Ignore actions that can't happen to Creatures:
  def method_missing(method, *args, &block)
    # Think carefully about using this method!
  end
end

class Door
  # Define some actions that can happen to Doors:
  def physical_damage(num_points)
    puts "Door sustains #{num_points} points of damage."
  end

  def fire_damage(num_points)
    puts "Door is burned for #{(num_points*1.5).to_i} points, and catches fire!"
    self.apply_damage_over_time(1, 10) # not implemented
  end

  # Ignore actions that can't happen to Doors:
  def method_missing(method, *args, &block)
    # Think carefully about using this method!
  end
end
This implementation is a bit sloppy. I would probably implement a "DamageEvents" module and apply it to those two classes, from which they would get the physical_damage and fire_damage methods, but whatever. Also, the method_missing method should not just ignore all illegal methods. There are probably cases where you will want to raise an exception if an object receives an illegal method message. But let's just go with it for the example.
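For the curious, here is roughly what the "DamageEvents" module idea could look like. This is only a sketch of one way to do it, not the code the rest of this example uses. The display_name method is a hypothetical helper that I added so the shared methods know what to call each object. I also left out method_missing on purpose, so an illegal message raises NoMethodError instead of silently vanishing.

```ruby
# A sketch of the DamageEvents idea. The display_name method is a
# hypothetical helper; the original classes don't have it.
module DamageEvents
  def physical_damage(num_points)
    puts "#{display_name} is hit for #{num_points} points!"
  end

  def fire_damage(num_points)
    puts "#{display_name} is burned for #{num_points} points!"
  end
end

class Creature
  include DamageEvents

  def initialize(name)
    @name = name
  end

  def display_name
    @name
  end
end

class Door
  include DamageEvents

  def display_name
    "Door"
  end

  # A class can still override a shared action when it behaves differently:
  def fire_damage(num_points)
    puts "Door is burned for #{(num_points * 1.5).to_i} points, and catches fire!"
  end
end

Creature.new("Neil").physical_damage(2)
Door.new.physical_damage(2)
Door.new.fire_damage(10)
```

The trade-off is that sending love_message to a Door now blows up, so whoever performs the action has to rescue the error or check respond_to? first.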
Now, how do you perform actions on the objects? Let's have some action!
# ------- Test the classes ----------------------
puts "Start..."
# Populate the game with stuff:
human = Creature.new("Neil")
door = Door.new
# Put stuff in a room:
things_in_room = [human, door]
# An explosion happens in the room!
explosion = ActionRecorder.new
explosion.physical_damage 2
explosion.fire_damage 10
# See what happens to everything in the room:
things_in_room.each do |thing|
  explosion.perform_on thing
end
# Apologize for the explosion
apology = ActionRecorder.new
apology.love_message "Sorry!"
things_in_room.each do |thing|
  apology.perform_on thing
end
puts "Done"
And here's the output:
Start...
Neil is hit for 2 points!
Neil is burned for 10 points!
Door sustains 2 points of damage.
Door is burned for 15 points, and catches fire!
Neil feels the love.
Done
I definitely like this way of coding this example! The code is written almost in English. Look at the definition of the explosion. It doesn't get much more readable than that! Also, "explosion.perform_on thing" is very easy to read. This is the kind of code that makes a programmer smile.
I can see more ways that the ActionRecorder will facilitate new features. Here are some thoughts about how to extend this type of proxy class:
- ActionRecorder can log all messages it receives, or send the messages to a Logger class that can choose which methods to log and which to ignore, in exactly the same way that we implemented the Creature and Door classes.
- A similar recorder can be used to execute scripted effects. A ScriptRecorder class. So, if your game's content developers write scripts in Lua or some home-grown script language, you can read them in and record them as ScriptRecorder objects that can be applied to game objects. You can write a Lua script that defines the explosion, read it in to a ScriptRecorder, and you've got the implementation of that script done. (Actually, I would use Ruby as the scripting language, so nevermind...)
- In-game chat messages that pass through a recorder (ChatRecorder?) could be parsed, logged, or blocked. Messages may need to go to some players, but not others. Whispers, guild chat, party chat, in-game languages, etc.
- Convert in-game chat messages to telnet protocol strings before sending them over the network.
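As a rough sketch of the first idea in that list, here is a recorder that keeps a readable log of every message it records. LoggingRecorder and Target are hypothetical names, and keeping the log in a plain array is just the simplest thing that works. A real game would probably hand the lines to a proper logger.

```ruby
# LoggingRecorder and Target are hypothetical names for this sketch.
class LoggingRecorder
  attr_reader :log

  def initialize
    @messages = []
    @log = []
  end

  # Record every message, and keep a readable line about it in the log.
  def method_missing(method, *args, &block)
    @log << "#{method}(#{args.map(&:inspect).join(', ')})"
    @messages << [method, args, block]
  end

  # Replay the recorded messages on an object.
  def perform_on(obj)
    @messages.each do |method, args, block|
      obj.send(method, *args, &block)
    end
  end
end

# A minimal game object that remembers what happened to it.
class Target
  attr_reader :hits

  def initialize
    @hits = []
  end

  def physical_damage(num_points)
    @hits << [:physical, num_points]
  end

  def fire_damage(num_points)
    @hits << [:fire, num_points]
  end
end

explosion = LoggingRecorder.new
explosion.physical_damage 2
explosion.fire_damage 10

target = Target.new
explosion.perform_on target

puts explosion.log
p target.hits
```

The recorder itself barely changed. The log line is one extra statement in method_missing, which is the appeal of funnelling every action through a single proxy.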
What do you think? How would you take advantage of this?
Tags:
Programming
Ruby on Rails
 Here’s another article about performance in Ruby on Rails apps. This one has information that is brutally obvious to anyone with database experience, but if you’re more of a web developer than a database admin, then this article may be insightful for you. Check out my first article, Using :select in Rails for Better Performance, for other tips.
I have looked at a lot of other people’s Rails code as I was learning, including their database migrations, and have rarely seen indexes being created with the tables. For those who aren’t using database migrations, take the time to learn about them and you’ll never go back to writing CREATE TABLE statements yourself again! Seriously, migrations might be my favorite feature in Rails!
Anyway, a table without an index is not much of a table at all. Querying a table without an index is a performance problem waiting to happen.
What’s An Index?
In a database, you have a few different kinds of objects. You’ve got tables, which are just big containers in which you put your data. The data is not sorted, so if you want to find something in a table, you need to look through everything until you find what you want.
You also have indexes. An index is a sorted list of rows in a table. It contains a small subset of the table’s data, just enough to sort the data and point to the real data rows in the real table. So, when you want to find something in a table, you look at a suitable index, not the table. The index will point to the rows of the table that match what you are looking for.
A Cheeky Example
Here’s the best example: a phone book. The phone book contains a lot of data, but we know how to find phone numbers very quickly because the listings are indexed by name. The phone book is indexed by last name, first name, and address, in that order. So when you want to find Cheeky McMonkey’s phone number, you flip right to the part of the phone book that has last names starting with M, then find Mc, and so on.
Now imagine if the phone book did not have an index. How would you find Cheeky McMonkey’s phone number? You would just start on the first page and look at all the names until you found Cheeky, one row at a time, one page at a time. A few days later, you might have found his phone number. Her phone number? Cheeky? No, it’s a him. His phone number. I should call him more often.
Cheeky in Rails
Let’s do the Rails database migration for a phone book with its index. We’ll create it with this command:
ruby script/generate migration PhoneBook
And here’s what the database migration could look like:
class PhoneBook < ActiveRecord::Migration
  def self.up
    # Create the table:
    create_table :phone_book do |t|
      t.column :phone_number, :string, :null => false
      t.column :last_name, :string, :null => false
      t.column :first_name, :string, :null => false
      t.column :address, :string, :null => false
    end
    # Create an index on the table:
    add_index :phone_book, [:last_name, :first_name, :address]
  end

  def self.down
    # Drop the index:
    remove_index :phone_book, [:last_name, :first_name, :address]
    # Drop the table:
    drop_table :phone_book
  end
end
Note that the order of the column names in the add_index statement is very important! This index will be used to find rows in the phone_book table when you do PhoneBook.find and your :conditions option specifies the last_name column (and possibly more). The last_name column is the most important column in the index. If a query (find) does not include last_name in the :conditions, then the index will not be used! The entire table will be searched instead.
The index is sorted, so the database can look at it just like we look at a phone book. It flips to the correct pages of the index just like we do when we flip open a phone book. When it finds the terms that match your :conditions, it can then look up the data (the phone_number) in the phone_book table.
Here are some find calls that will use the index and perform very quickly no matter how many millions of rows are in the phone_book table:
PhoneBook.find(:all,
  :conditions => ['last_name = ? AND first_name = ? AND address = ?',
    'McMonkey', 'Cheeky', '421 Monkey Banana Crescent'])
PhoneBook.find(:all,
  :conditions => ['last_name = ? AND first_name = ?',
    'McMonkey', 'Cheeky'])
PhoneBook.find(:all,
  :conditions => ['last_name = ?', 'McMonkey'])
PhoneBook.find(:all,
  :conditions => ['last_name = ? AND address = ?',
    'McMonkey', '421 Monkey Banana Crescent'])
Even if your :conditions don’t include the entire column list of the index, the database can still use the index. If you limit your search with last_name, then the index can be used.
Here are queries that cannot use the index, and will perform very poorly:
PhoneBook.find(:all,
  :conditions => ['first_name = ? AND address = ?',
    'Cheeky', '421 Monkey Banana Crescent'])
PhoneBook.find(:all,
  :conditions => ['address = ?', '421 Monkey Banana Crescent'])
PhoneBook.find(:all,
  :conditions => ['phone_number = ?', '416-555-0880'])
Going back to our phone book analogy, it’s easy to understand why these queries would perform horribly. If you knew a phone number but wanted to know who it belonged to, the phone book won’t be able to help you very well unless you have a lot of time on your hands.
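To make the difference concrete, here's a toy model of an index in plain Ruby. It is only a sketch of the idea, not how any real database stores its indexes, and the rows (other than Cheeky's) are invented. The "index" is a list of keys sorted by last name, first name, and address, each pointing back at a row.

```ruby
# Hypothetical rows: [last_name, first_name, address, phone_number]
rows = [
  ["Smith",    "Anna",   "12 Oak Street",              "416-555-0101"],
  ["McMonkey", "Cheeky", "421 Monkey Banana Crescent", "416-555-0880"],
  ["Adams",    "Zoe",    "9 Elm Road",                 "416-555-0144"],
  ["McMonkey", "Albert", "3 Vine Lane",                "416-555-0199"],
]

# The toy index: sorted keys, each paired with the position of its row.
index = rows.each_with_index
            .map { |row, position| [row[0, 3], position] }
            .sort

# Leading column known: binary search to the first match, then read forward.
def find_by_last_name(index, rows, last_name)
  start = index.bsearch_index { |key, _position| key[0] >= last_name }
  return [] if start.nil?
  index[start..-1]
    .take_while { |key, _position| key[0] == last_name }
    .map { |_key, position| rows[position] }
end

# Leading column unknown: the sort order is no help, so check every entry.
def find_by_address(index, rows, address)
  index
    .select { |key, _position| key[2] == address }
    .map { |_key, position| rows[position] }
end

p find_by_last_name(index, rows, "McMonkey").map { |row| row[1] }
p find_by_address(index, rows, "421 Monkey Banana Crescent").map { |row| row[3] }
```

Both lookups return the right rows, but only the first one gets to skip anything. With four rows nobody cares. With millions, the second one is the query that keeps your visitors waiting.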
One Index Or Two?
So, what if you wanted to do reverse look-ups like that? If your web site needs to find people’s names based on their phone numbers, then you can just create another index on the table:
add_index :phone_book, :phone_number
Done. You can create as many indexes as you need on a table. It’s like having multiple copies of the same phone book, but each of them sorted differently.
Note however that there is no advantage in having indexes with redundant column lists. For example, these indexes are redundant:
add_index :phone_book, [:last_name, :first_name, :address]
add_index :phone_book, [:last_name, :first_name]
add_index :phone_book, :last_name
Only the first index is needed. The column lists of the last two are leading prefixes of the first index, so the first index can already serve the same queries. The database might give you an error or warning if you try to create them, I’m not sure. It depends on which database product you are using.
What Kind of “Good Performance” Does This Give Me?
By this point, imagine having multiple copies of the same phone book in your house. It’s great to have phone numbers indexed in different ways. Who wouldn't enjoy that? But they sure take up a lot of space on the coffee table. And when phone numbers get added, changed, and removed, you’ve got to replace all your phone books!
This is true in a database too. Having indexes improves query performance, but can hurt other kinds of database operations: inserts, updates, and deletes. If your app changes the data in your database frequently, then you need to consider the performance penalties of indexes on those operations. When table data changes, all the indexes need to be updated. You’ll have to find a balance between query performance and transaction performance. Some testing and measurements may be needed. Once you create a fifth or sixth index on a table, you should really stop and think about it. What kinds of :conditions are you using in your queries? Can they be changed to use existing indexes?
If your database is almost entirely used for queries, not transactions, then have no fear. Indexes are what you want. In practice, for a typical web app, indexes will have negligible impact on create, update, and delete operation performance. But I would be remiss if I didn't mention this point.
How Do I Know Which Indexes I Need?
To figure out which indexes you should create, I suggest you look at all the find calls that you make in your Rails code. The :conditions you use tell you which indexes you need. Here’s an example of two separate queries:
@article_list = Article.find(:all,
  :conditions => ['created_at < ? AND comments_count > ?', Time.now, 0])
@article = Article.find(:first, :conditions => ['created_at < ?', Time.now])
Both of those queries will benefit from an index on created_at (first) and comments_count (second):
add_index :articles, [:created_at, :comments_count]
It’s that easy! This is why developers and database admins need to talk to each other. Database admins won't know which indexes to create if they don't know what queries the developers are using.
Final Cheeky Thoughts
If your Rails app is suffering from performance problems, the solution might be as easy as adding a missing index. With the right index, any query should run in a fraction of a second. If you measure your database performance in seconds rather than milliseconds, then it might be time to look at all your queries and indexes to see if you’re missing something.
Also, the on-disk size of an index depends on the columns of that index. So, if you’ve got a string column that has a length measured in hundreds or thousands, then you should probably not include it in an index. Your VARCHAR(32000) columns are not the best candidates for indexes. Use them only if you have a very good reason.
Well, that was a long one. I hope this helps some Rails folks develop the fastest web sites on the Internet! Please let me know if you have any questions or additions for this article.
If you found this post useful, why not bookmark my Ruby On Rails content listing? That page will be updated whenever I tag an article with "Ruby on Rails". I've got another article about database transactions in mind, which is another infrequently used feature of Rails that I think everyone should be using. Anyway, don't get me started. I'll stop now.
Tags:
Programming
Ruby on Rails
 Here’s an article for the Ruby on Rails programmers out there. Geek Skillz is home-made using Rails, as an exercise for me to learn this popular framework. Rails makes it very easy to use databases with your web sites, which is a great thing, but with a little problem. As I search for Rails info on the web, I find that a lot of writers don’t seem to have the database background to understand performance issues. They let Rails deal with the database and think nothing more about it. I think it’s very important that if you use databases in your code, in whichever language and framework you choose, you should take time to learn about them in depth!
DISTINCT in Rails
I was looking into a way to list all the values that appear in one column of a table. I didn’t want all the rows, just the unique values that appear in a column. In SQL, this is done with the “DISTINCT” keyword. For example, “SELECT DISTINCT year FROM blog_posts” would list all the years that are in the year column of the blog_posts table.
So, how do you do a SELECT DISTINCT query with Rails? Here’s one solution that I found (I won’t post a link to the site since the author ought to be embarrassed!):
@distinctlist = Item.find(:all).map{ |i| i.fieldname }.uniq
That would certainly work. But what if your “item” table has thousands of rows? Millions of rows? This single line of code could take a very, very long time to complete! I mean, minutes or hours! The programmer might look at that and think it is elegant and compact, but won’t think so when it brings the server to its knees.
How does that statement work? First, it executes this:
Item.find(:all)
That will fetch all columns in all rows from the items table, without exception, and load it into memory. No query can perform more slowly than that, or use more memory. Once that horrendous step is done, all the data is passed to the map function. Once that step is done, all that data is then passed to the uniq function. If you do this on a table that will always have a few rows, then you’re fine. Otherwise, your website visitors will leave your site before the query even completes.
The better solution is to use the :select parameter of the ActiveRecord find method. Here’s an example:
Item.find( :all, :select => 'DISTINCT fieldname' )
The :select parameter is overriding part of Rails’ query to the database. Rather than letting Rails build the entire query, we are telling it which columns to select and how to query them. Rails will still generate most of the query, but use our select list instead.
The advantage of this method is that we pass all the work to the database. Databases are designed to be the very best at analyzing and retrieving data very quickly. That’s what they do! The database has indexes (which you should have created in the database migration) to perform the DISTINCT processing for us in sub-second time. This is far better than telling the database to give us all the data from the table so that we can deal with it using Ruby ourselves. Make the database work for you. Don’t reproduce database functionality in your Ruby code.
Note that my example will only give you one column from the table: fieldname. In the Item objects that get instantiated, you won’t be able to access the other attributes of the model. But that makes sense, given our query. We want to know the unique values of “fieldname”, so we don’t need the other columns of the table. That wouldn't make sense. Which brings us to another benefit: minimal data is sent from the database to your Rails app, lowering memory usage.
Improving Performance with :select
What else can you do with the :select option? You can tailor your queries to give you only the information you need, which can improve performance greatly in some situations. For example, say I have a table of text articles (like I do with this blog). It could have these columns:
created_at (datetime)
title (string)
bodytext (string, containing the entire text of an article)
comments_count (integer)
The bodytext column will have a huge amount of data compared to the other three columns. After years of posting articles on my blog, there can be hundreds of rows in this table. If I want to fetch many of the rows in this table, but don’t need to use the bodytext column, then I can improve performance and memory usage by excluding bodytext from the query, like this:
@articles = Article.find( :all, :select => 'created_at, title, comments_count' )
This will use a tiny amount of memory and be very fast compared to the output of this query:
@articles = Article.find( :all )
Using :select means that you need to be mindful of how the query result is used. If a part of my code tries to do article.bodytext on an element, it will find that bodytext does not exist. But I’m not suggesting that you use :select for general use queries. You’ll use it for specific situations where you know what you need and when it will be used.
I hope this is useful to someone. In general, I think carefully whenever I use find(:all) in Rails. Do my :conditions limit the query enough? Is there an index for the :conditions? Can I use :limit? Is there something in the :select option that I should be using to improve performance and memory usage? Those are the kind of things that should pop into your mind too whenever you use ActiveRecord’s find method.