Tuesday, February 14, 2006

Setting up the config.ini file for NDB

The config.ini file is the ndb_mgm NDB management process directive. There are Global settins and specific settings for the various components of NDB.

Here are my settings

[NDBD DEFAULT]

NoOfReplicas=2# Number of Replicas: This defines what Node groups contain the
# same data for HA.
# So, say we have 8 Data notes with ids 2,3,4,5,6,7,8,9
# Node Group 1: 2,3 Node Group 2: 4,5 Node Group 3: 6,7
# Node Group 4: 8,9
# To make sure you don't loose data don't stick a Node Group on
# the Same box: i.e. ID 2,3
# should not be the same box.

DataMemory=6G # How much memory to allocate for database records and indexes
# All data is stored in FIX size, so a VARCHAR(255) will take 255 bytes
# Each record also has some overhead. 16-bytes since each record is
# stored in a
# 32 KB page with 128 byte page overhead.
# CREATE TABLE test (
# a INT NOT NULL,
# b INT NOT NULL,
# c INT NOT NULL,
# PRIMARY KEY (a),
# UNIQUE KEY (b)
# )
#
# 3 * 4 = 12 bytes + fixed overhead 12 bytes
# (4 bytes are saved from fixed since no null columns)
# == 24 byte Record size
# DataMemory also stores Index Data, it's called ordered data;
# 10 bytes to be exact a(4)+1byte + b(4)+1byte = 10 byte overhead
# per record.
# 34 byte Record Size
# so 963 Data records fits in a single page


#
IndexMemory=5G # How much memory to allocate for index storage, this is really
# mis labled it should be HashIndexMemory
# The primary key and unique key are hash indexes and they are
# implemented in a way
# where each key has a 25 byte overhead + size of the column
# In our example it's 29 bytes for the Primary Key
# 29 bytes for the Unique Key + 8 bytes for b+a + 12 bytes of
# overhead (4 bytes saved)
# since the Unique key is translated into an internal table with b as the PRIMARY and a as a column.
#

# Management process options:
[NDB_MGMD]
hostname=66.94.XXX.XXX # Hostname or IP address of MGM node
datadir=/local1/mysql-cluster # Directory for MGM node logfiles

[NDBD]
hostname=66.94.XXX.XXX
datadir=/local1/mysql/data

[NDBD]
hostname=66.94.XXX.XXX
datadir=/local1/mysql/data

[NDBD]
hostname=66.94.XXX.XXX
datadir=/local1/mysql/data
[NDBD]
hostname=66.94.XXX.XXX
datadir=/local1/mysql/data




[MYSQLD]
[MYSQLD]
[MYSQLD]
[MYSQLD]


my my.cnf settings


# -*- sh -*-
# my.cnf
#
[mysqld]
port = 3306
socket = /var/lib/mysql/mysql.sock
user = mysql
skip-locking
default_table_type = innodb
server-id = 12
slave-skip-errors = 1062
#
# used for myISAM table buffers and temp tables which are myISAM tables
#
key_buffer = 32M
#
#If any database uses text or blob types raise this on the master
#
max_allowed_packet = 16M
#
# this is used in part for setting the unlimit and it's only good with myISAM
#
table_cache=256
#
# sort_buffer is used for GROUP BY and ORDER BY.
# It's a thread specfic buffer so it allocates up to the setting max when needed
# Set to 512K if lots of simultaneous connections
sort_buffer_size = 2M

#
# read_buffer_size is used for sequential scan. Nice for limit through the list
#
read_buffer_size = 1M

#expected length of queries. The communication buffer
# is reset to this size between queires
net_buffer_length=8K
#
# max_seeks_for_key - tell mysql to prefer indexes over
# table scans when it thinks that a table scan is better
#
max_seeks_for_key = 100

myisam_sort_buffer_size=16M
max_connect_errors=999999999
max_connections=400

thread_cache=30

#
# this is the amount of memory used before a tmp_file goes to disk.
# Its also used for internal data structures
#
tmp_table_size = 64M

datadir = /var/lib/mysql

tmpdir = /local1/mysql

interactive_timeout = 60
wait_timeout = 60

## avoid DNS problems
skip-name-resolve

## Slow Query Log

long_query_time=3
# cluster-specific settings
ndbcluster
ndb-connectstring=66.94.XXX.XXX:1186

Wednesday, February 08, 2006

Setting up NDB - Installing from RPM SRC

So, since many of the pre-compiled provided by dev.mysql.com/downloads are not what I'm looking for I decided to make my own RPM packages.

Here is my install script

!/bin/bash

export CC=/usr/bin/gcc32
export CXX=/usr/bin/gcc32
export LD_LIBARY_PATH=$LD_LIBARY_PATH:/lib64:/usr/lib64
export CFLAGS="-O3 -static -fomit-frame-pointer -ffixed-ebp"
export CXXFLAGS="-03 -felide-constructors -fno-exceptions -fno-rtti"

rpmbuild -ba mysql-5.0.18.spec

This is on a RHAS-4.0 .6.9-22.12.y1-64 #5 SMP Fri Nov 4 13:39:56 PST 2005 x86_64 x86_64 x86_64 GNU/Linux
Using

libtool-libs-1.5.6-4.EL4.1
libtool-1.5.6-4.EL4.1
autoconf-2.59-5
automake17-1.7.9-5
automake-1.9.2-3

I also changed some of the spec options as well, I like static builds, dynamic builds do not provide as much performance and are subject to failure when a lib changes.
#
BuildMySQL "--disable-shared %if %{STATIC_BUILD}
--with-mysqld-ldflags='-all-static' --with-client-ldflags='-all-static' $USE_OTHER_LIBC_DIR %else
--with-zlib-dir=bundled %endif
--with-comment=\"MySQL Community Edition - Flickr\" --with-server-suffix='%{server_suffix}' --with-archive-storage-engine --with-innodb --with-big-tables --with-ndbcluster --with-federated-storage-engine --with-big-tables"





compile time takes about 30 min now the next step is to set NDB up.

Evaluating MySQL cluster (NDB)

Well, I've introduced a new Architecture to Flickr.com, called Federation. Federation is the idea that data is better to manage if it where spread across many servers instead of duplicated from a single master to many slaves. More on this later.

Now for NDB, mySQL's cluster Database, I'm evaluating it to see if it can provide a HA solution for a component of Federation. Federation in itself is HA, yet how do you know how to get to a particular shard?

Well there is a global lookup table to tell the requester where to go. This global lookup table needs to be HA, and here is where NDB comes in. NDB is very slow for range type queries so it doesn't make sense to use it for the entire federated arch, but it's reportedly good for primary key lookups, and this fits perfect for our lookup table.


More to come.