After having gone through about a handful of SIP clients, I have to say that baresip is by far the most reliable one I have found. No random breakages. Accounts just work once you have the configuration done. It has some idiosyncrasies in its configuration layout because it tries to be possibly too modular and wide in scope, but it is still wonderful to use.
This is a bit off-topic but I'm wondering if I'm going to have to hack baresip to implement it:
Is there such thing as a SIP to mobile phone bridge, so that you can make phone calls over the net via a mobile phone that is in some remote location?
The application in my case is making local calls in a country that is expensive to call into. If it exists, it's the kind of thing that is resistant to search, unless it's called something I'm not aware of.
Edit: I think I might be looking for a SIP GSM Gateway running on an Android phone
To clarify I'm looking for:
initiate voip handset call => internet => call is initiated from remote mobile phone
remote mobile phone receives call => internet => voip handset rings
Or put another way something that exposes the Android phone as a trunk line.
You can try a software PBX solution -- I've spun up quite a few 3CX deployments using Twilio as the SIP provider. Then you can call in a number of ways (super configurable routing inbound and outbound). There's a "softphone" mobile app for internet -> sip calling on smart phones. 3CX has pretty reasonable licensing (I think there may still be a free tier) and Twilio is super cheap.
I am using baresip on my phone (from F-Droid). I don't have my whole stack plumbed yet, so I can't say how well it works, but it connects to my Asterisk server via Wi-Fi well enough. I was expecting SIP to be built into Android, but apparently it was removed in 10 or 11. Then I found the baresip package. I like baresip well enough for desktop use, so I decided to try it on the phone.
It is one of those slow-burn projects I work on every couple of months, mainly because I hate doing anything on the phone. The idea is that I have an Asterisk server at the office that ties into the dialed phone world via a SIP trunk. I should be able to put a SIP client on the phone and tunnel a secure connection to the office; then the cell phone is just another office phone... anywhere in the world.
The only piece left is the tunnel. I have my heart set on IPsec. Is this folly? Should I be looking into other tunnel tech?
ipsec for the control plane (SIP) might be OK, but you'll probably run into jitter and latency issues with the media (RTP/RTCP). Also RTP has provisions for packet loss, but end-to-end encryption won't like that. SIP already has an encrypted mode [1], but I don't know how well it's currently supported in the wild. The purpose-built secure media protocols are Secure RTP/Secure RTCP [2].
Getting just plain SIP and RTP through consumer, enterprise, and cell carrier nets can be troublesome. For example, SIP especially is NAT-antagonistic, and edge routers can try to be "helpful" and do half-baked proxying or deep packet mangling that just messes up your smarter VoIP server. Consumer ISPs have a tendency to block SIP altogether. Throwing in encryption complications may give you another layer of hard-to-diagnose headaches. (Of course, if you can get the secure layer through the net it might force intermediaries to keep their fingers out. Other tricks like using non-standard ports can help with that, too.)
Source: used to run the signalling/media dev team for a VoIP provider many moons ago, but we didn't bother with the secure modes/protocols.
VoWiFi uses an IPsec tunnel for both SIP and RTP to the provider's IMS stack. With its popularity, I think it's likely that network admins increasingly allow for it in networks you do not control.
Very nice library. I used it some time ago to implement a module that applied ASR and other audio analytics to audio streams in real time and relayed extracted information via SIP messages. As with many projects in that space, it doesn’t come with a lot of documentation.
It says SIP supports instant messaging. Never knew that. I wonder why it is not a popular protocol for all these "instant messengers" out there (especially since it also supports P2P)? How does it compare to Matrix?
Protocol popularity is like programming language popularity. It's not enough to be technically competent (though I don't know if SIP is). You need marketing, big enough mindshare/uptake for network effects to kick in, cultural background match (telco world in the case of SIP), a "killer app", etc.
SIP’s instant messaging support and capabilities are basically the same as SMS. You can also negotiate MSRP over it, which adds group chat, which is what RCS does (the thing Google Messages speaks), but stuff like scrollback and e2e encryption is missing. It also doesn’t put an emphasis on extensibility, unlike XMPP & Matrix.
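For the curious, here is a rough sketch of what a SIP instant message looks like on the wire, using the MESSAGE method from RFC 3428. The addresses, the `local_host` value, and the helper name `build_sip_message` are made up for illustration; a real client would also handle authentication, routing, and retransmissions.

```python
import uuid


def build_sip_message(sender, recipient, text, local_host="client.example.org"):
    """Build a bare-bones SIP MESSAGE request (RFC 3428) as bytes.

    All addresses and the local_host default are placeholders.
    """
    body = text.encode("utf-8")
    # RFC 3261 requires Via branch values to start with this magic cookie.
    branch = "z9hG4bK" + uuid.uuid4().hex[:16]
    lines = [
        f"MESSAGE sip:{recipient} SIP/2.0",
        f"Via: SIP/2.0/UDP {local_host};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{sender}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{recipient}>",
        f"Call-ID: {uuid.uuid4().hex}@{local_host}",
        "CSeq: 1 MESSAGE",
        "Content-Type: text/plain",
        f"Content-Length: {len(body)}",  # length of the body in bytes
    ]
    # Headers are CRLF-separated; a blank line precedes the body.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + body


request = build_sip_message("alice@example.org", "bob@example.com", "Door is open")
print(request.decode("utf-8"))
```

The whole text rides in the body of a single request with no session around it, which is why it feels so much like SMS.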
SIP is a protocol for Voice over Internet Protocol or VOIP.
SIP is the control channel and you use it to login to the server (once registered calls to a number will be sent to your program) and control both sending and receiving calls. The actual audio protocol is RTP, typically 20ms of audio is sent or received per RTP packet.
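To make the 20 ms figure concrete, here is a minimal sketch of packing one RTP packet by hand, assuming G.711 mu-law at 8 kHz (one byte per sample, RTP payload type 0). The header layout is the fixed 12-byte header from RFC 3550; the function name and the SSRC value are arbitrary choices for the example.

```python
import struct

SAMPLE_RATE_HZ = 8000   # G.711 narrowband audio (assumed codec)
PACKET_MS = 20          # typical packetization interval
SAMPLES_PER_PACKET = SAMPLE_RATE_HZ * PACKET_MS // 1000


def build_rtp_packet(seq, timestamp, ssrc, payload, payload_type=0):
    """Prefix the payload with the fixed 12-byte RTP header (RFC 3550)."""
    first_byte = 2 << 6                 # version 2, no padding/extension/CSRCs
    second_byte = payload_type & 0x7F   # marker bit clear; 0 = PCMU
    header = struct.pack(
        "!BBHII",
        first_byte,
        second_byte,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )
    return header + payload


# 20 ms of mu-law silence: 0xFF is the mu-law encoding of zero amplitude.
payload = b"\xff" * SAMPLES_PER_PACKET
packet = build_rtp_packet(seq=1, timestamp=0, ssrc=0x1234ABCD, payload=payload)

# The RTP timestamp advances by the number of samples per packet, not by milliseconds.
next_timestamp = 0 + SAMPLES_PER_PACKET
print(len(payload), len(packet), next_timestamp)
```

A sender would emit one of these every 20 ms, bumping the sequence number by 1 and the timestamp by `SAMPLES_PER_PACKET` each time.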
SIP is more or less the signaling protocol for telephone service as VoIP. SIP doesn't include the actual audio, which is usually RTP on a different socket, but this client seems to do audio too.
If you've got VoIP service for a landline, it's probably SIP. I think VoLTE might be SIP too? There are extensions for text messaging over SIP as well.
VoLTE and VoNR are just SIP over a cell carrier prioritized data channel (Google LTE QCI). T-Mobile has a beta for latency sensitive applications to access the 5G equivalent of this, it's called Network Slicing.
UMTS and similar 2G and 3G protocols could do neat things like have multiple towers compare the signals received from your phone to enable more reliable calling, but this functionality did not survive to make it into cellular's LTE or 5G New Radio (NR) standards.
Last I worked with telephony (around 2010), IIRC we received calls over the 3G network as SS7/H.323 using Dialogic hardware and converted them on the fly to SIP as they entered our systems.
I'm really intrigued. Is it possible to get video calling working with mobile networks' own video calling support over LTE? I could never find how it's done.
I've not done that personally, but I'd have a look on the freeswitch forums to see if someone on there has already done it.
I have had the free 3CX sip client on an android mobile, using a UDP vpn back to a freeswitch server which was connected to the UK phone system. Proof of concept thing.
Because of the nature of the mobile data, which is like bursts of data, it made it hard to have a conversation even around midnight and 1am when cell traffic volumes are low.
I was told UK telcos give voice data priority, as it needs to be realtime, and other data then gets lower levels of priority, making it hard to have a reliable stream of data.
FYI, 56K is the minimum standard for voice calls, although with 4G and beyond it's possible to hear the higher-bitrate protocols being used on the main mobile networks, as the audio is not so muffled. It's like someone turning up the high-end frequencies on a graphic equaliser, whereas they are fully down with the 56k protocols.
I don't know what the minimum video protocol bitrate would be, but I did find this, which might be useful.
> H.264 will only consume around 10 kbps with around 2 fps in 176x144.
When Skype first came out, it was peer to peer, so as a proof of concept to test the new 3G network I did manage to maintain a voice call for an hour, which was impressive. It also shows that mobile networks in the UK can maintain mobile data streams if the traffic is not encrypted inside a VPN.
> I have had the free 3CX sip client on an android mobile, using a UDP vpn back to a freeswitch server which was connected to the UK phone system. Proof of concept thing.
I don't think any of the US networks do that, unless you want to set up your own phone operator :(
Just for the record, that isn’t true. H.323 is a VoIP protocol and is implemented on a lot of private PBX systems. Most on-prem systems can handle SIP or H.323, as can just about any standard enterprise IP phone. H.323 works out of the box; pure SIP requires additional infrastructure, mainly an SBC (session border controller).
As the other comments have noted, it's the "control plane" that tells all the Voice-over-IP parts how to set up and tear down a call.
As for the complications... Yeah.
The folks who invented VoIP had a dream. They saw the existing extremely centralized, extremely locked-down phone carriers as The Enemy. Instead, they envisioned a loose decentralized federation of peers who directly contacted one another over the Internet to make calls.
Since there weren't supposed to be any all-knowing, all-powerful, all-standard-enforcing choke points, everything had to be negotiated and a consensus built up across the interconnected parts as to the call state and control. Worse, intermediate services (proxies) were allowed to be stateless. So the consensus has to be continually reinforced during and after the call.
Another design choice was to split functionality into lots of optional tiny pieces. "SIP" has a bazillion RFCs (standards specs), and some pretty byzantine control flows to implement all the interconnections. (To be fair, The Enemy is just as bad or worse. Both camps were dealing with extremely limited computing power and bandwidth in the distant past, plus incremental feature enhancement.)
My poster child is "call parking". This allows you to "park" a call from one phone by putting it on hold, and then pick it up on another. The suggested SIP call flow (don't remember which RFC it's part of) involves specialized "parking services"; all these do is receive a call and then later forward it. The minimal flow diagram has dozens of distinct steps involving half-a-dozen players.
SIP was a standard, not proprietary, and "good enough", so it became the overwhelming standard for interconnection of VoIP players. In reality, the vast majority of control flows are:
1. Phone says "hey central server, here's a phone number. Call them."
2. Central server says "Hey phone, here's an incoming call."
3. Phone says: "OK, I'll answer".
4. Phone to server: "Hang up".
5. Server to phone: "Hang up".
6. Phone: "Here's some random buttons my user pushed mid-call." (This includes digits for IVR menu trees, and on/off hold).
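The six flows above can be sketched as a tiny state machine. In standard SIP (RFC 3261) placing or receiving a call is an INVITE, answering is a 200 OK, and hanging up is a BYE; mid-call digits usually travel as RFC 4733 telephone events in RTP or as SIP INFO, and hold is a re-INVITE. The event names below are made up for this sketch, and a real stack also deals with provisional responses, ACKs, CANCEL, timers, and retransmissions.

```python
from enum import Enum


class CallState(Enum):
    IDLE = "idle"
    DIALING = "dialing"   # we sent INVITE and are waiting for an answer
    RINGING = "ringing"   # we received INVITE, user has not answered yet
    ACTIVE = "active"     # call answered, media flowing


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (CallState.IDLE, "send_invite"): CallState.DIALING,    # 1. call this number
    (CallState.IDLE, "recv_invite"): CallState.RINGING,    # 2. incoming call
    (CallState.RINGING, "send_200_ok"): CallState.ACTIVE,  # 3. OK, I'll answer
    (CallState.DIALING, "recv_200_ok"): CallState.ACTIVE,  # far end answered
    (CallState.ACTIVE, "send_bye"): CallState.IDLE,        # 4. phone hangs up
    (CallState.ACTIVE, "recv_bye"): CallState.IDLE,        # 5. server hangs up
    (CallState.ACTIVE, "send_midcall"): CallState.ACTIVE,  # 6. digits, hold
}


def step(state, event):
    """Return the next call state, or raise if the event is invalid here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.name}") from None


# An incoming call that is answered, gets one keypress, then is hung up remotely.
state = CallState.IDLE
for event in ["recv_invite", "send_200_ok", "send_midcall", "recv_bye"]:
    state = step(state, event)
    print(event, "->", state.name)
```

Almost everything else in the bazillion RFCs is about what happens when one of these simple steps goes through proxies, forks, times out, or gets renegotiated.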
Audio almost always uses one of three or four popular encoding choices: 1) "Dead stupid" (G.711), 2) "Bad compression" (G.729), 3) "Oooh! Hifi!" (G.722), 4) "Something open, decent, and from the current century" (Opus). Video is usually H.263 or one of its relatives.
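A quick back-of-the-envelope sketch of what those codec choices cost in bandwidth, assuming the 20 ms packetization mentioned elsewhere in the thread, IPv4, and no header compression. The nominal bitrates are the standard ones for G.711 (64 kbit/s), G.729 (8 kbit/s), and G.722 in its common 64 kbit/s mode; Opus is left out because its bitrate is variable. The helper names are just for illustration.

```python
PACKET_MS = 20                    # assumed packetization interval
OVERHEAD_BYTES = 20 + 8 + 12      # IPv4 + UDP + RTP headers per packet

# Nominal codec bitrates in bits per second.
CODEC_BITRATES_BPS = {
    "G.711": 64_000,
    "G.729": 8_000,
    "G.722": 64_000,
}


def payload_bytes(codec_bps, packet_ms=PACKET_MS):
    """Audio bytes carried in one packet: bits/s * seconds / 8."""
    return codec_bps * packet_ms // 8000


def ip_bitrate_bps(codec_bps, packet_ms=PACKET_MS):
    """Bitrate at the IP layer, including per-packet header overhead."""
    packets_per_second = 1000 // packet_ms
    packet_size = payload_bytes(codec_bps, packet_ms) + OVERHEAD_BYTES
    return packet_size * 8 * packets_per_second


for name, bps in CODEC_BITRATES_BPS.items():
    print(
        f"{name}: {payload_bytes(bps)} payload bytes per packet, "
        f"{ip_bitrate_bps(bps) / 1000:.0f} kbit/s at the IP layer"
    )
```

The point is that the 40 bytes of headers per packet are a fixed tax: for a low-bitrate codec like G.729 the headers outweigh the audio, so the on-the-wire saving is much smaller than the codec bitrates alone suggest.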
It would be more proper to compare baresip to PJSIP/pjsua. I'm using both and both are great, with multiple unique features. The strong point for PJSIP would, in my opinion, be the pjmedia conference bridge (easy to use and powerful mixing, volume control, audio format conversion); this basically has no equivalent in baresip. I also like PJSIP's layered code organization more. On the other hand, baresip is slightly lighter in terms of memory usage (more suitable for e.g. STM32F4). And while PJSIP licensing is not that expensive (considering the size and maturity of the project), baresip would be free most of the time even in commercial projects thanks to its BSD license.
Quite a few years ago I used this to do two-way audio communication for an embedded intercom system, and it worked great. I'm not sure whether any alternatives for non-desktop use have come out in recent years.
Interesting. I'm trying to (in background mode) build an open source SIP video doorbell, and so far finding a good client that can work with raw video capture devices proved tricky. All the clients with video support need full X-Windows or Wayland stacks, which I most definitely don't want.
This seems to be exactly what I need!
Now, why did you have to write it in pure C?!? It's not a big deal for me, my VoIP devices are already in a separate VLAN without external access, but still.
Looks like the project started in 2010; there weren't many great options then if you wanted to build something portable, not to mention easily bindable in other languages. (And there are plenty of good rationales for not wanting to use C++.)
Because there are now relatively simple, portable languages that vastly improve on C's safety and virtually non-existing type system, in order to prevent a lot of easy but critical security issues.
I say that as someone who's been writing C for decades, and still does very regularly.
No, they should have definitely thrown away their entire codebase and embarked on a serious migration project to proven industrial-grade solutions like Zig or Nim.
In my experience, it’s the developer that matters. A not-very-good developer can produce buggy code and bad architecture in any of the strongly typed languages with advanced type systems, too.
It's not so much a threat model, but maintainability and reliability. I want to have as few dependencies as possible, and for the code to work with minimal to no changes for the next 10 years.
Ideally, I would use something like HomeAssistant Operating System for the base, with my code running in a Docker container.
If it's about maintainability and reliability, and not so much a threat model, why does the fact that your VoIP devices are already in a separate VLAN without external access mean it's not a big deal for you?
> I want to have as few dependencies as possible, and for the code to work with minimal to no changes for the next 10 years.
> If it's about maintainability and reliability, and not so much a threat model, why does the fact that your VoIP devices are already in a separate VLAN without external access mean it's not a big deal for you?
VLAN is what makes it not such a big deal. However, I want to offer the same solution to other people in our HOA, and I most definitely don't want to set up VLANs for each one of them.
If you're just receiving and sending video and a bell signal to/from a static IP, what problem is SIP solving? I mean, can't you just put up gstreamer to take video from an esp32cam?
I can see a benefit here, if you attach your SIP doorbell to a trunk you can configure it to do a full video call to your cell phone over the normal mobile network.
I would personally try to hook into one of the many other apps I have in my phone that support video calling, but SIP shouldn't require an app if your carrier supports ViLTE.