Here we cover the background information needed to understand VoIP.
Overview of a VoIP connection
To set up a VoIP communication we need:
- First, an ADC to convert the analog voice into digital signals (bits).
- Then the bits have to be compressed into a format suitable for transmission: there are a number of protocols for this, which we'll see later.
- Next we have to insert our voice packets into data packets using a real-time protocol (typically RTP over UDP over IP).
- We need a signaling protocol to call users: ITU-T H323 does that.
- At the receiving end (RX) we have to disassemble the packets, extract the data, convert it back to an analog voice signal and send it to the sound card (or phone).
- All of this must be done in real time, because we cannot wait too long for a spoken answer! (see the QoS section)
Base architecture
Voice )) ADC - Compression Algo - Assembling RTP in TCP/IP  ----
                                                            ----> |
                                                            <---- |
Voice (( DAC - Decompress. Algo - Disass. RTP from TCP/IP   ----
Analog to Digital Conversion
This is done in hardware, typically by the ADC integrated in the sound card.
Today every sound card can convert, with 16-bit samples, a band of 22050 Hz (by the Nyquist principle, sampling it requires a frequency of 44100 Hz), giving a throughput of 2 bytes * 44100 (samples per second) = 88200 Bytes/s, or 176.4 kBytes/s for a stereo stream.
For VoIP we don't need such a throughput (176 kBytes/s) to send voice packets: next we'll see the other codings used for it.
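As a quick check of the arithmetic above, here is a minimal Python sketch that computes the raw data rate of uncompressed audio. The function name pcm_throughput is ours for illustration, not part of any library.

```python
def pcm_throughput(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> int:
    """Return the uncompressed data rate in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


if __name__ == "__main__":
    print(pcm_throughput(44100, 16))     # mono sound card stream
    print(pcm_throughput(44100, 16, 2))  # stereo sound card stream
    print(pcm_throughput(8000, 8))       # telephone-quality voice
```

The last call shows why telephone-quality voice is so much cheaper to carry than a sound card stream.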
Compression Algorithms
Now that we have digital data we can convert it to a standard format that can be transmitted quickly.
PCM, Pulse Code Modulation, Standard ITU-T G.711
- Voice bandwidth is 4 kHz, so the sampling frequency has to be 8 kHz (for Nyquist).
- We represent each sample with 8 bits (giving 256 possible values).
- Throughput is 8000 Hz * 8 bit = 64 kbit/s, as on a typical digital phone line.
- In real applications the mu-law (North America) and A-law (Europe) variants are used, which code the analog signal on a logarithmic scale, compressing 14-bit (mu-law) or 13-bit (A-law) linear samples into 8 bits (see Standard ITU-T G.711).
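The logarithmic coding idea can be sketched in Python as follows. This uses the continuous mu-law formula with mu = 255 and a plain 8-bit quantizer. Real G.711 codecs use a segmented approximation with a specific bit layout, which is not reproduced here, and all function names are ours.

```python
import math

MU = 255  # companding constant of the mu-law curve


def mulaw_compress(x: float) -> float:
    """Map a linear sample in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mulaw_expand(y: float) -> float:
    """Inverse of mulaw_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode_8bit(x: float) -> int:
    """Compand a sample, then quantize it to one of 256 codes (0..255)."""
    y = mulaw_compress(max(-1.0, min(1.0, x)))
    return round((y + 1) / 2 * 255)


def decode_8bit(code: int) -> float:
    """Turn an 8-bit code back into an approximate linear sample."""
    return mulaw_expand(code / 255 * 2 - 1)


if __name__ == "__main__":
    for sample in (0.01, 0.1, 0.5):
        code = encode_8bit(sample)
        print(sample, code, decode_8bit(code))
```

Quiet samples get more of the 256 codes than loud ones, which is why 8 bits per sample are enough for speech.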
ADPCM, Adaptive differential PCM, Standard ITU-T G.726
It encodes only the difference between the current voice sample and a prediction based on the previous ones, requiring 32 kbps (see Standard ITU-T G.726).
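To illustrate the differential idea only, here is a minimal Python sketch of plain differential coding, where the prediction is simply the previous sample. This is a deliberate simplification: G.726 adds an adaptive predictor and an adaptive quantizer, neither of which is modelled here, and the function names are ours.

```python
from typing import List


def dpcm_encode(samples: List[int]) -> List[int]:
    """Replace each sample by its difference from the previous one."""
    previous = 0
    diffs = []
    for sample in samples:
        diffs.append(sample - previous)
        previous = sample
    return diffs


def dpcm_decode(diffs: List[int]) -> List[int]:
    """Rebuild the samples by accumulating the differences."""
    value = 0
    samples = []
    for diff in diffs:
        value += diff
        samples.append(value)
    return samples


if __name__ == "__main__":
    voice = [100, 102, 105, 104, 104]
    print(dpcm_encode(voice))               # small numbers after the first one
    print(dpcm_decode(dpcm_encode(voice)))  # the original samples again
```

Since neighbouring voice samples are usually close to each other, the differences are small and need fewer bits than the samples themselves.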
LD-CELP, Standard ITU-T G.728
CS-ACELP, Standard ITU-T G.729 and G.729a
MP-MLQ, Standard ITU-T G.723.1, 6.3kbps, Truespeech
ACELP, Standard ITU-T G.723.1, 5.3kbps, Truespeech
LPC-10, able to reach 2.4 kbps!
These last protocols are the most important because they can guarantee a very low minimum bandwidth using source coding; G.723.1 codecs also have a very high MOS (Mean Opinion Score, used to measure voice fidelity), but pay attention to the processing power they require, up to 26 MIPS!
RTP Real Time Transport Protocol
Now we have the raw data and we want to encapsulate it into the TCP/IP stack. We follow this structure:
VoIP data packets
RTP
UDP
IP
I,II layers
VoIP data packets live in RTP (Real-Time Transport Protocol) packets which are inside UDP-IP packets.
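This layering has a cost in bandwidth, which the following Python sketch estimates. It assumes the usual header sizes (12 bytes for RTP without CSRC entries, 8 for UDP, 20 for IPv4 without options) and a packetization interval of 20 ms. The interval and the function names are our assumptions, and layer I/II framing is ignored.

```python
RTP_HEADER_BYTES = 12   # fixed RTP header, no CSRC entries
UDP_HEADER_BYTES = 8
IPV4_HEADER_BYTES = 20  # no IP options


def payload_bytes(codec_bps: int, packet_ms: int) -> int:
    """Voice bytes carried by one packet, rounded up to whole bytes."""
    return -(-(codec_bps * packet_ms) // 8000)  # ceiling division


def ip_bandwidth_bps(codec_bps: int, packet_ms: int = 20) -> int:
    """Bit rate seen at IP level once the three headers are added."""
    headers = RTP_HEADER_BYTES + UDP_HEADER_BYTES + IPV4_HEADER_BYTES
    packet = payload_bytes(codec_bps, packet_ms) + headers
    return packet * 8 * 1000 // packet_ms


if __name__ == "__main__":
    print(ip_bandwidth_bps(64000))  # a 64 kbit/s codec such as G.711
    print(ip_bandwidth_bps(8000))   # an 8 kbit/s codec
```

The lower the codec rate, the larger the share of bandwidth taken by the headers.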
Firstly, VoIP doesn't use TCP because it is too heavy for real-time applications, so UDP (datagrams) is used instead.
Secondly, UDP has no control over the order in which packets arrive at the destination or how long it takes them to get there (datagram concept). Both of these are very important to overall voice quality (how well you can understand what the other person is saying) and conversation quality (how easy it is to carry out a conversation). RTP solves the problem by enabling the receiver to put the packets back into the correct order and not wait too long for packets that have either lost their way or are taking too long to arrive (we don't need every single voice packet, but we need a continuous, ordered flow of most of them).
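The receiver behaviour just described can be sketched as a small reordering buffer in Python. This is only an illustration under simplifying assumptions: sequence numbers never wrap around, waiting is counted in buffered packets rather than in milliseconds, and the names reorder and max_buffered are ours.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple


def reorder(arrivals: Iterable[Tuple[int, bytes]], first_seq: int,
            max_buffered: int = 3) -> Iterator[Tuple[int, Optional[bytes]]]:
    """Yield (seq, payload) in sequence order; payload None marks a lost packet."""
    buffer: Dict[int, bytes] = {}
    expected = first_seq
    for seq, payload in arrivals:
        if seq < expected:
            continue  # too late: its slot has already been played out
        buffer[seq] = payload
        # Release everything that is now in order.
        while expected in buffer:
            yield expected, buffer.pop(expected)
            expected += 1
        # Too many packets are waiting behind a gap: give up on the missing one.
        while len(buffer) > max_buffered:
            yield expected, None
            expected += 1
            while expected in buffer:
                yield expected, buffer.pop(expected)
                expected += 1
    # End of stream: play whatever is still waiting.
    for seq in sorted(buffer):
        yield seq, buffer[seq]


if __name__ == "__main__":
    arrived = [(1, b"a"), (3, b"c"), (2, b"b"), (4, b"d")]
    print(list(reorder(arrived, first_seq=1)))
```

A larger max_buffered tolerates more disorder but adds delay, which is the trade-off tuned by the jitter settings of VoIP applications.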
Real Time Transport Protocol
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |       sequence number         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           synchronization source (SSRC) identifier            |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|            contributing source (CSRC) identifiers             |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Where:
- V indicates the version of RTP used
- P is the padding flag: when set, the packet ends with one or more padding bytes that are not part of the payload
- X indicates the presence of a header extension
- CC is the number of CSRC identifiers following the fixed header. CSRC fields are used, for example, in conferences.
- M is a marker bit
- PT is the payload type
For a complete description of the RTP protocol and all its applications see RFCs 1889 and 1890 (since superseded by RFCs 3550 and 3551).
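The header layout above can be packed and parsed with Python's struct module. This is a minimal sketch with function names of our own choosing. It handles the fixed 12-byte header plus optional CSRC entries and ignores header extensions and padding bytes.

```python
import struct


def pack_rtp_header(seq, timestamp, ssrc, payload_type,
                    marker=False, padding=False, extension=False, csrc=()):
    """Build an RTP version 2 header: 12 fixed bytes plus 4 bytes per CSRC."""
    first = (2 << 6) | (int(padding) << 5) | (int(extension) << 4) | (len(csrc) & 0x0F)
    second = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", first, second,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + b"".join(struct.pack("!I", c) for c in csrc)


def parse_rtp_header(data):
    """Decode the fields of an RTP header into a dictionary."""
    first, second, seq, timestamp, ssrc = struct.unpack_from("!BBHII", data, 0)
    cc = first & 0x0F
    csrc = [struct.unpack_from("!I", data, 12 + 4 * i)[0] for i in range(cc)]
    return {
        "version": first >> 6,
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),
        "cc": cc,
        "marker": bool(second & 0x80),
        "payload_type": second & 0x7F,
        "sequence": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrc": csrc,
    }


if __name__ == "__main__":
    header = pack_rtp_header(seq=1, timestamp=160, ssrc=0x12345678, payload_type=0)
    print(header.hex())
    print(parse_rtp_header(header))
```

The "!" in the format string selects network byte order, which is what RTP uses on the wire.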
RSVP
There are also other protocols used in VoIP, like RSVP, that can manage Quality of Service (QoS).
RSVP is a signaling protocol that requests a certain amount of bandwidth and a latency bound from every network hop that supports it.
For detailed info about RSVP see RFC 2205.
Quality of Service (QoS)
We have said many times that VoIP applications require real-time data streaming, because we expect an interactive voice exchange.
Unfortunately, TCP/IP cannot guarantee this: it only makes a "best effort". So we need to introduce tricks and policies to manage the packet flow in EVERY router we cross.
These are:
- TOS field in the IP header to describe the type of service: its precedence bits rank packets from routine (0) up to network control (7), so higher values mark more urgent traffic, and further bits request low delay, high throughput or high reliability
- Packet queuing methods:
  - FIFO (First In First Out), the simplest method, which forwards packets in arrival order.
  - WFQ (Weighted Fair Queuing), which shares the link fairly among data flows (for example, FTP cannot consume all the available bandwidth), typically passing one packet for UDP and one for TCP in turn.
  - CQ (Custom Queuing), where users can decide the priority.
  - PQ (Priority Queuing), where there are a number (typically 4) of queues, each with its own priority level: packets in the first queue are sent first, then (when the first queue is empty) sending starts from the second one, and so on.
  - CB-WFQ (Class Based Weighted Fair Queuing), like WFQ but with the addition of classes (up to 64), each with an associated bandwidth value.
- Shaping capability, which allows limiting the source to a fixed bandwidth in:
  - download
  - upload
- Congestion Avoidance, like RED (Random Early Detection).
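Of the queuing methods above, Priority Queuing is the easiest to sketch. The Python class below is a toy model, not router code: the class name, the choice of queue 0 as the most urgent one and the packet labels are all our assumptions.

```python
from collections import deque


class PriorityQueuing:
    """Strict priority scheduler: queue 0 is always served first."""

    def __init__(self, levels: int = 4):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, packet, priority: int) -> None:
        """Store a packet in the queue of the given priority level."""
        self.queues[priority].append(packet)

    def dequeue(self):
        """Return the next packet to send, or None if all queues are empty."""
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None


if __name__ == "__main__":
    pq = PriorityQueuing()
    pq.enqueue("ftp-1", 3)
    pq.enqueue("voice-1", 0)
    pq.enqueue("ftp-2", 3)
    pq.enqueue("voice-2", 0)
    print([pq.dequeue() for _ in range(4)])
```

Voice packets leave first even though they arrived after the FTP ones. The drawback of this method is that low priority queues can starve while a higher one is busy.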
For exhaustive information about QoS see Differentiated Services at IETF.
H323 Signaling Protocol
The H323 protocol is used, for example, by Microsoft Netmeeting to make VoIP calls.
This protocol allows a variety of elements to talk to each other:
- Terminals, clients that initiate the VoIP connection. Although terminals can talk to each other without anyone else, we need some additional elements for a scalable setup.
- Gatekeepers, which essentially provide:
  - an address translation service, to use names instead of IP addresses
  - admission control, to allow or deny some hosts or some users
  - bandwidth management
- Gateways, points of reference for conversion between TCP/IP and PSTN.
- Multipoint Control Units (MCUs), to provide conferencing.
- Proxy servers are also used.
H323 allows not only VoIP but also video and data communications.
Concerning VoIP, H323 can carry the audio codecs G.711, G.722, G.723, G.728 and G.729, while for video it supports H261 and H263.
More info about H323 is available at Openh323 Standards, at this H323 web site and in its standard description: ITU H-series Recommendations.
You can find it implemented in various applications like Microsoft Netmeeting, Net2Phone, DialPad, ... and also in freeware products you can find at the Openh323 web site.
Hardware requirements
To create a small VoIP system you need the following hardware:
- a 386 PC or better
- a sound card, full duplex capable
- a network card, a connection to the Internet or another kind of interface to allow communication between 2 PCs
All of this has to be present twice to simulate a standard communication. The items above are the minimum requirements for a VoIP connection: next we'll see that we should (and on the Internet we must) use more hardware to do the same in a real situation. The sound card has to be full duplex, otherwise we couldn't hear anything while speaking! In addition you can use hardware cards (see next) able to manage the data stream in a compressed format (see Par 4.3).
Hardware accelerating cards
We can use special cards with hardware accelerating capability. Some of them are listed below; the two Quicknet cards are also the only ones directly managed by the Linux kernel at this moment:
- Quicknet PhoneJack
- Quicknet LineJack
- VoiceTronix V4PCI
- VoiceTronix VPB4
- VoiceTronix VPB8L
Quicknet PhoneJack is a sound card that can use standard algorithms to compress the audio stream, like G723.1 (section 4.3), down to a 4.1 Kbps rate.
It can be connected directly to a phone (POTS port) or to a mic and speaker pair.
It has an ISA or PCI bus connector.
Quicknet LineJack works like PhoneJack with some additional features (see next).
VoiceTronix V4PCI is a PCI card much like Quicknet LineJack but with 4 phone ports.
VoiceTronix VPB4 is an ISA card equivalent to V4PCI.
VoiceTronix VPB8L is a logging card with 8 ports.
For more info see the Quicknet web site and the VoiceTronix web site.
Hardware gateway cards
Quicknet LineJack and VoiceTronix cards can be connected to a PSTN line, allowing the VoIP gateway feature.
You'll then need software to manage it (see later).
Software requirement
We can choose what O.S. to use:
- Win9x
- Linux
Under Win9x we have Microsoft Netmeeting, Internet Phone, DialPad and others, or Internet Switchboard (from the Quicknet web site) for Quicknet cards.
Warning!!: The latest Quicknet cards using Switchboard (older versions too) NEED to be connected to the Internet to work, in order to manage the Microtelco account (not free of charge), so if you plan to remain isolated from the Internet you need to install the OpenH323 software.
For VoiceTronix cards you can find software at the VoiceTronix web site.
Under Linux we have the free software GnomeMeeting, a clone of Microsoft Netmeeting, while in console mode we use applications (also free software) from the OpenH323 web site: simph323 or ohphone, which can also work with Quicknet accelerating hardware.
Attention: all Openh323 source code has to be compiled in a user directory (otherwise it is necessary to change some environment variables). Be warned that the compile time can be very long and you may need a lot of RAM to do it in a decent time.
Gateway software
To manage the gateway feature (joining TCP/IP VoIP to PSTN lines) you need software such as:
- Internet SwitchBoard (only when connected to the Internet) for Windows systems, also acting as an H323 terminal;
- PSTNGw for Linux and Windows systems, which you download from OpenH323.
Gatekeeper software
As gatekeeper you can download the free product Openh323 Gatekeeper (GK) from here.
Version 2.0 of it supports a "proxy function" to enable talking from/to a private network.
Other software
In addition, here is some useful H323-compliant software:
- Phonepatch, able to solve problems behind a NAT firewall. It simply allows users (external or internal) to call from a web page (which is reachable by both external and internal users): when the web application sees that the remote host is ready, it calls (H323) the source, telling it that all is OK and communication can be established. Phonepatch is proprietary software (there is also a demo version limited to conversations of no more than 3 minutes) which you download from here.
The same function can be obtained using the "Proxy" function of the Gatekeeper Gnugk (see before).
Here we see how to configure special hardware cards in Linux and Windows environments.
Quicknet PhoneJack
As we saw, Quicknet Phonejack is a sound card with VoIP accelerating capability. It supports:
- G.711 normal and mu/A-law, G.728-9, G.723.1 (TrueSpeech) and LPC10.
- Phone connector (to allow calling directly from your phone) or
- Mic & speaker jacks.
Quicknet PhoneJack is an ISA (or PCI) card to install into your PC. It can work without an IRQ.
Software installation
Under Windows you have to install:
- Card driver
- Internet Switchboard application (working only with Internet, using newer Quicknet cards)
all downloadable from Quicknet web site
After Switchboard has been installed, you need to register with Quicknet to obtain the full capability of your card.
When you pick up the phone, Internet Switchboard wakes up and waits for the number you are calling (entered directly from your phone). You can:
- enter an asterisk, then type an IP address (with asterisks in place of the dots) with a # at the end
- type a PSTN phone number directly (with international prefix) to call a classic phone user. In this case you need to register with a gateway manager, which you pay for the call time.
- enter a quick dial number (up to 2 digits) that you have previously stored, which makes a call (IP or PSTN).
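The first dialing rule (asterisk, IP address with asterisks in place of dots, closing #) can be written as a small Python helper. The function name is ours, and the output format simply follows the description above rather than any Quicknet documentation.

```python
import ipaddress


def switchboard_dial_string(ip: str) -> str:
    """Return the key sequence to dial an IP address from the phone keypad."""
    address = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    return "*" + str(address).replace(".", "*") + "#"


if __name__ == "__main__":
    print(switchboard_dial_string("192.168.1.2"))
```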
Internet Switchboard is H323 compatible, so you can use, for example, Microsoft Netmeeting at the other end to talk.
Warning!! Internet Switchboard NEEDS to be connected to the Internet when used with newer Quicknet cards.
In place of Internet Switchboard you can use the openh323 applications openphone (using a GUI) or ohphone (command line).
Under Linux you have to install:
- Card driver, from the Quicknet web site. After downloading it you have to compile it (you must have a /usr/src/linux soft or hard link to your Linux source directory): type make for instructions.
- Application openphone or ohphone.
- If you are a developer you can use SDK to create your own application (also for Windows).
Settings
With Internet Switchboard (and with other applications) you can:
- Change the preferred compression algorithm
- Tune jitter delay
- Adjust volume
- Adjust echo cancellation level.
Quicknet LineJack
This card is very similar to the previous one; it also supports the gateway feature.
We only note that we have to download the PSTNGw application (for Linux and Windows) or use Internet Switchboard for the gateway feature.
VoiceTronix products
- First download software here
- Untar it
- Modify 'src/vpbreglinux.cpp' according to file README
- type 'make'
- type 'make install'
- cd to src
- type 'insmod vpb.o'
- retrieve (from the console or from the 'dmesg' command output) the major number, say MAJOR
- type 'mknod /dev/vpb0 c MAJOR 0' where MAJOR is the above number
- cd to unittest and type './echo'
Follow README file for more help.
I personally haven't tested VoiceTronix products, so please contact the VoiceTronix web site for support.
In this chapter we try to set up a VoIP system, simple at first, then more and more complex.
Simple communication: IP to IP
A (Sound card) - - - B (Sound card)
192.168.1.1 - - - 192.168.1.2
192.168.1.1 calls 192.168.1.2 and vice versa.
A and B should have
- an application like Microsoft Netmeeting, Internet Switchboard, Openh323 (under Windows) or Ohphone, Gnomemeeting (under Linux), installed and properly configured.
- a network card or other kind of TCP/IP interface to talk to each other.
In this setup A can make an H323 call to B (if B has the server-side application active) using B's IP address. B can then answer if it wants. After the call is accepted, VoIP data packets start to flow.
Using names
Under Microsoft Windows a NetBIOS name can be used instead of an IP address.
A - - - B
192.168.1.1 - - - 192.168.1.2
John - - - Alice
John calls Alice.
This is possible because John's call request to Alice is converted to an IP call by the NetBIOS protocol.
The above 2 examples are very easy to implement but aren't scalable.
On a larger scale such as the Internet it is impossible to use direct calling because, usually, the callers don't know the destination IP address. Furthermore the NetBIOS naming feature cannot work because it uses broadcast messages, which typically don't pass ISP routers.
You can also use DNS to resolve a name to an IP address: for example you can call ''box.domain.com''.
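What a VoIP application does with such a name can be sketched in a few lines of Python using the standard resolver. The function name resolve_callee is ours. An address given directly as dotted numbers is returned unchanged.

```python
import socket
from typing import Optional


def resolve_callee(name: str) -> Optional[str]:
    """Return the IPv4 address for a host name, or None if it cannot be resolved."""
    try:
        return socket.gethostbyname(name)
    except OSError:
        return None


if __name__ == "__main__":
    print(resolve_callee("127.0.0.1"))       # already an address
    print(resolve_callee("box.domain.com"))  # result depends on your DNS
```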
Internet calling using a WINS server
The NetBIOS name calling idea can also be implemented in an Internet environment, using a WINS server: NetBIOS clients can be configured to use a WINS server to resolve names.
PCs using the same WINS server will be able to call each other directly.
A (WINS Server is S) - - - - I - - - - B (WINS Server is S)
                             N
                             T
                             E - - - - - S (WINS Server)
C (WINS Server is S) - - - - R
                             N
                             E - - - - D (WINS Server is S)
                             T
Internet communication
A, B, C and D are in different subnets, but they can call each other by NetBIOS name. All that is needed is that they all use S as their WINS server.
Note: a WINS server doesn't have very high performance because it uses NetBIOS features, and should only be used for joining a few subnets.
ILS server
ILS is a kind of server which allows you to resolve names during an H323 call: when you start the VoIP application you first register with the ILS server using a name, then everyone will be able to see you using that name (if they use the same ILS server!).
A big problem: masquerading.
The shortage of IP addresses is commonly solved using so-called masquerading (also NAT, network address translation): there is only 1 public IP address (that the Internet can directly "see"), and the other machines are "masqueraded", all sharing this IP.
A - - -
B - - - Router with NAT - - - Internet
C - - -
This doesn't work
In the example A, B and C can browse, ping, and use mail and news services with Internet hosts, but they CANNOT make a VoIP call. This is because the H323 protocol sends the IP address at application level, so the answer will never arrive at the source (which is using a private IP address).
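Whether a terminal would announce an unreachable address can be checked with Python's ipaddress module. This sketch only tests if the address belongs to a private range. The function name is ours and it says nothing about how H323 encodes the address.

```python
import ipaddress


def is_private(address: str) -> bool:
    """True if the address is in a private range, so a host on the Internet cannot answer it."""
    return ipaddress.ip_address(address).is_private


if __name__ == "__main__":
    for host in ("192.168.1.1", "10.0.0.5", "8.8.8.8"):
        print(host, is_private(host))
```

A terminal behind NAT announcing 192.168.1.1 inside its H323 messages gives the remote side an address it has no route to.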
Solutions:
- There is a Linux module that modifies H323 packets, avoiding this problem. You can download the module here. To install it you have to copy it to the specified source directory, modify the Makefile, then compile and install the module with "modprobe ip_masq_h323". Unfortunately this module cannot work with the ohphone software at this moment (I don't know why).
A - - - Router with NAT
B - - - + - - - Internet
C - - - ip_masq_h323 module
This works
- There is an application program that also solves this problem: for more see Par 5.7
A - - -
B - - - PhonePatch - - - Internet
C - - -
This works
Open Source applications
Ohphone Syntax
Syntax is:
"ohphone -l|--listen [options]"
"ohphone [options]... address"
- "-l", listen on the standard port (1720)
- "address", means that we don't wait for a call, but we connect to the "address" host
- "-n", "--no-gatekeeper", use this if we don't have a gatekeeper
- "-q num", "--quicknet num", use the Quicknet card, device /dev/phone(num)
- "-s device", "--sound device", use the /dev/device sound device.
- "-j delay", "--jitter delay", change the delay buffer to "delay".
Also, when you start ohphone, you can give commands directly to the interpreter (like decreasing AEC, Automatic Echo Cancellation).
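If you start ohphone from a script, the options above can be assembled as in this Python sketch. It only builds the argument list from the flags listed above. The function and its parameter names are ours, and actually launching the result (for example with subprocess.run) assumes ohphone is installed and in your PATH.

```python
from typing import List, Optional


def ohphone_command(address: Optional[str] = None, gatekeeper: bool = False,
                    quicknet: Optional[int] = None, sound: Optional[str] = None,
                    jitter: Optional[int] = None) -> List[str]:
    """Build an ohphone argument list; with no address we listen for calls."""
    argv = ["ohphone"]
    if address is None:
        argv.append("-l")            # wait for incoming calls
    if not gatekeeper:
        argv.append("-n")            # we have no gatekeeper
    if quicknet is not None:
        argv += ["-q", str(quicknet)]
    if sound is not None:
        argv += ["-s", sound]
    if jitter is not None:
        argv += ["-j", str(jitter)]
    if address is not None:
        argv.append(address)         # host to call
    return argv


if __name__ == "__main__":
    print(" ".join(ohphone_command()))
    print(" ".join(ohphone_command("192.168.1.2", quicknet=0)))
```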
Gnomemeeting
Gnomemeeting is a GUI application for making VoIP calls. It is very simple to use and lets you use an ILS server, chat and other things.
Setting up a gatekeeper
You can also experiment with the gatekeeper feature.
Example
(Terminal H323) A - - -
                       \
(Terminal H323) B - - - D (Gatekeeper)
                       /
(Terminal H323) C - - -
Gatekeeper configuration
- Hosts A, B and C have their gatekeeper setting pointing to D.
- At start time each host tells D its own address and name (plus any aliases), which a caller can use to reach it.
- When a terminal asks D for a host, D answers with the right IP address, so communication can be established.
Note that the Gatekeeper can only resolve names to IP addresses; it cannot connect hosts that cannot reach each other (at IP level). In other words, it cannot act as a NAT router.
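The registration and lookup steps above amount to a name table, as in this toy Python model. It is not the Openh323 Gatekeeper and does not speak H323. The class and method names are ours.

```python
from typing import Dict, Iterable, Optional


class ToyGatekeeper:
    """Keeps a table of names and aliases, as host D does in the example."""

    def __init__(self) -> None:
        self._table: Dict[str, str] = {}

    def register(self, ip: str, name: str, aliases: Iterable[str] = ()) -> None:
        """Called by a terminal at start time to announce itself."""
        for alias in (name, *aliases):
            self._table[alias] = ip

    def resolve(self, alias: str) -> Optional[str]:
        """Return the IP address registered for a name, or None if unknown."""
        return self._table.get(alias)


if __name__ == "__main__":
    d = ToyGatekeeper()
    d.register("192.168.1.1", "john", aliases=("101",))
    d.register("192.168.1.2", "alice")
    print(d.resolve("101"), d.resolve("alice"), d.resolve("bob"))
```

The table only returns an address. The caller still has to be able to reach that address at IP level.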
You can find gatekeeper code here: openh323 library is also required.
The program only has to be launched with the -d (as daemon) or -x (execute) parameter.
In addition you can use a config file (.ini), which you find here.
Setting up a gateway
As we said, a gateway is an entity that can join VoIP to PSTN lines, allowing us to make calls from the Internet to a classic telephone. So, in addition, we need a card that can manage PSTN lines: Quicknet LineJack does it.
From the OpenH323 web site we download:
- the driver for LineJack
- the PSTNGw application to create our gateway.
If the executable doesn't work you need to download the source code and the openh323 library, then install everything in a user's home directory.
After that you only need to launch PSTNGw to start your H323 gateway.
Compatibility Matrix
The first matrix refers to:
- Software intercommunication (e.g. Netmeeting with SwitchBoard)
- Software/Driver/Hardware interaction (e.g. Netmeeting can use a PhoneJACK card).
             | Netmeeting | SwitchBoard | Simph323 | OhPhone | LinPhone | Speak-Freely | HW PhoneJACK | HW LineJACK |
Netmeeting   | V          | V           | V        | V       | X        | X            | V            | V           |
SwitchBoard  | V          | V           | V        | V       | X        | X            | V            | V           |
Simph323     | V          | V           | V        | V       | X        | X            | X            | X           |
OhPhone      | V          | V           | V        | V       | X        | X            | V            | V           |
LinPhone     | X          | X           | X        | X       | V        | X            | X            | X           |
Speak-Freely | X          | X           | X        | X       | X        | V            | X            | X           |
HW PhoneJACK | V          | V           | X        | V       | X        | X            | -            | -           |
HW LineJACK  | V          | V           | X        | V       | X        | X            | -            | -           |
The second matrix refers to the gateway software that manages the LineJACK card.
 ___________________________________________________________
|              |HW LineJACK GW| SwitchBoard  |    PSTNGW    |
|______________|______________|______________|______________|
|HW LineJACK GW|      _       |      V       |      V       |
|______________|______________|______________|______________|
| SwitchBoard  |      V       |      _       |      _       |
|______________|______________|______________|______________|
|    PSTNGW    |      V       |      _       |      _       |
|______________|______________|______________|______________|
Notation:
- V : Works
- X : Doesn't Work
- - or _ : Doesn't matter (not applicable)